Animal farm


Europe’s policy-makers must not buy animal-rights activists’ arguments that addiction is a
social, rather than a medical, problem.


Drug addiction is a disease. Images of the brains of addicts show
alterations in regions crucial to learning and memory, judgement and
decision-making, and behavioural control. Drugs imitate natural
neurotransmitters, resulting in false or abnormal messages being sent
around neural circuits. The brain’s central reward system is
overstimulated and flooded with dopamine. The brain adapts to this flood
by turning down its ability to respond to dopamine, so addicts take more
and more of the drug to push dopamine levels higher.

Changes in other reward-system neurotransmitters such as glutamate can
impair cognitive function. And the triggering of subconscious memory
systems leads to conditioning, so environmental cues such as particular
people or places set off uncontrollable cravings.

None of that is particularly controversial, at least among scientists.
So why do a growing number of politicians in Europe want to curtail
research into addiction? Why would they deny their constituents the
hope that they or their loved ones might one day be helped with the
terrible burden of this disease?

The answer is a troubling new front in the long battle over the use
of animals in research (see page 24). Campaigners opposed to animal
research have targeted addiction as the soft underbelly of political
support for such work. Addiction is a social problem, they argue, not
a medical one. And social problems are not solved by science, or by
research on animals.

That is a seductive message for politicians. Care and compassion for
drug addicts is rarely a vote-winner. Care and compassion for animals is
a sure thing. Many voters believe that funds are best focused on crushing
drug barons and locking up dealers. Many also believe that addicts are
at best weak-minded, at worst evil, and have only themselves to blame if
their drug habits kill them. If the science of addiction can be questioned,
then why bother pursuing medical cures based on scientific research?

(For a taste of the muddled thinking on offer here, search the Internet
for a recent ‘debate’ on addiction featuring the journalist Peter
Hitchens and the actor Matthew Perry, broadcast by the BBC’s
current-affairs programme Newsnight.)


DANGEROUS DECREE 
Flawed, unscientific thinking on addiction has already produced a 
decree in Italy, expected to become law next month, which bans the use 
of all animals in addiction research, despite vociferous objections from
the scientific community. The dozen or so Italian groups working in this 
area will have three years to phase out their research, and other scientists 
hoping to develop a drug for any brain-related disorder, from anxiety 
to migraines, will no longer be able to generate safety data required by 
regulatory agencies on their addictive potential in animals. In Belgium, 
the government is rushing through legislation that would ban addiction 
research using monkeys — again in the face of objections from scientists. 
Let’s be clear: research using animals has been central to our 
understanding that drug addiction is a chronic, relapsing disease that
changes the structure and function of the brain; that an individual’s
genetic make-up accounts for around half of the vulnerability to 
addiction; and that environmental factors are crucial in precipitating 
addictive behaviours in the vulnerable. Environmental factors include 
stress at critical developmental stages, from the womb, through early 
childhood, to adolescence. Without animal research — including work 
on primates, whose brains are most like our own — it will not be pos- 
sible to go further and discover exactly how the neural circuitry in an 
individual brain is shaped by interacting genetic and environmental 
forces. It is an extremely tough nut to crack, but it must be cracked.
Research using animals is, rightly, a perennially sensitive issue. But to
claim that animals may not be used specifically for addiction research is
to define those affected by the disorder as less worthy of care and
concern than those with other disorders. Politicians cannot choose to
ignore scientific evidence and then claim that they do not know that
addiction is a disease. Animal-rights campaigners have unleashed a
dangerous argument. It must be stopped in its tracks, and quickly.

Winning the war against the distress and damage caused by misuse 
of drugs requires diverse approaches that address demand as well as 
supply. It may not seem intuitive to those witnessing the misery and 
violence around the drug world — in the United States alone, illicit 
drug use costs more than US$190 billion a year in crime, increased 
health-care costs and lost productivity — but it is likely that demand 
can be reduced by developing treatments for the self-destructive crav- 
ings that drive drug addiction. Given the technical tools now available 
for looking deep inside the brain, there is realistic hope that such treat- 
ments will emerge from research in the coming decades. 

The work must continue. Europe should look to the United States 
and to inspirational figures such as Nora Volkow, head of the US National
Institute on Drug Abuse in Bethesda, Maryland, who regularly testifies 
on the science of addiction to the US Congress to justify the institute's 
research budget. 

Volkow — a neuroscientist born in Mexico, a country blighted 
by drug wars — has the scientific clarity of vision, and the relentless 
patience, to be able to argue for the promise of research effectively 
year in, year out. Such wisdom also exists in Europe, but politicians 
too frequently ignore it. 

Volkow is the great-granddaughter of Leon Trotsky, the Russian 
revolutionary who was famously assassinated in 1940 — in the fam- 
ily home in Mexico in which Volkow herself grew up. She fights for a 
different cause: rational drug politics. Her European counterparts are 
fighting too. All governments must pay attention. 

Diseases can be cured. People affected can be helped by science and 
research, and yes, by the use of animals. Addiction is no different.


Invisible borders


UK immigration rules are perceived as being 
tougher than they really are. 


In an international business such as science, there is no true
immigration, just movement. National borders are notional;
passports indicate where someone was born, not where they
belong or where they can do the most good.

Politicians, and much of the public, do not see immigration like that.
They see strain on services and overcrowded job markets and, in a few
cases, some use these legitimate concerns as thin masks for prejudice,
xenophobia and racism. As economic difficulties bite, governments
frequently promise to ‘get tough’ on immigration, often turning a blind
eye to the benefits of an influx of people as they do so.

In 2010 the warnings were stark. The United Kingdom’s new, firmer
stance on immigration could “spell disaster for UK science”, warned
the Royal Society of Chemistry. Nobel laureates voiced similar concerns
in the national press. The Campaign for Science and Engineering
(CaSE) mobilized to keep the United Kingdom ‘open for business’,
with calls to give priority to visa applications from overseas scientists.

Nature too spoke out about the risk posed by crude measures to curb
immigration, which could potentially scupper the ability of some of
Britain’s leading research laboratories to recruit the best people (see
Nature 468, 346; 2010).

To the UK government’s credit, it heeded the warnings and made
exceptions for scientists, loosening the rules to grant them entry under
circumstances that would cause other migrants from different professions
to be turned away. A new scheme for ‘exceptionally talented’
scientists and artists was created. Some 700 scientists and 300 artists
a year would be allowed in. Problem solved?

Apparently not. The United Kingdom is again in the throes of a political
debate about the benefits and problems of immigration, and science
lobby groups are again worried that researchers will be caught in the
crossfire. As we report on page 14, these groups recently badgered the
Home Office at a meeting on the subject, and some made the same
doom-laden predictions.

Unfortunately, the campaigners are on less solid ground this time.
The exceptional-talent route has mainly been exceptional in its
underuse: the latest figures show that by June 2013 only 89 people
(both scientists and artists) had used it. Indeed, one researcher told
Nature that his visa is so rare that “when I re-enter the UK, the border
staff always comment ‘Oh, I’ve never seen one of those’”.

Hundreds of places on the scheme remain unfilled. And other conces- 
sions remain. In fact, scientists are in a better position than just about 
anyone else who wishes to move to and work in the United Kingdom, 
with the possible exception of international soccer stars. And even there, 
similar rhetoric about the effect on domestic talent has led to footballers 
being recently refused work permits. 


The situation is not perfect, far from it. Many academics have
justifiable gripes with the UK Border Agency and its visa processes:
a postdoc forced out of the country near-penniless perhaps, or an
eminent colleague scheduled to deliver a keynote speech at a
conference turned away. Such difficulties are common throughout most
of the world. The United States is currently grappling with
how to keep happy the thousands of scientists who are unable to obtain
green cards for permanent residency every year.

‘Highly skilled’ migrants are usually singled out for praise when poli- 
ticians confront immigration, but the subject is a notorious minefield 
and is hard for politicians to navigate with rational arguments. Image, 
perceived approach and rhetoric about being ‘tough’ buy popularity
here perhaps more than in any other political sphere — whatever the 
evidence may say. 

After the recent meeting with the Home Office, CaSE said that such 
“messaging” from the UK government about making it harder for immi- 
grants could itself deter leading scientists from coming. That could 
explain, for example, why piles of the exceptional-talent visas remain 
unused in the drawers of the Border Agency. But seen another way, the 
government is making it possible for scientists to come, whereas it is the 
campaigners who organize open letters and give media-friendly brief- 
ings about how hard it is for them to do so. Pressure groups have one 
weapon — pressure — and it is one that can be as crude as any political 
rhetoric. If messaging is the problem, then the campaigners must be 
cautious about the message that they themselves send. 

There are real problems with the movement of scientists across 
borders, and campaigners are right to highlight them. Nature will 
continue to press for such obstacles to be removed — the real and the 
rhetorical.




Trick of the light 


The Amazon doesn’t absorb extra carbon in the
dry season after all. It can become a carbon source.


Budding biologists learn early the apparently simple holy trinity
of ingredients for photosynthesis: carbon dioxide, water and
light. In truth, the equation is a little more complicated than
that, and when photosynthesis proceeds on a truly massive scale, these
complications can have huge implications.

Take, for example, the world’s largest mass of concentrated
photosynthesis: the Amazon rainforest of South America. Scientists have
long struggled to work out whether the rate of photosynthesis there
is controlled by the available amount of water or of sunlight. (Over
seasonal timescales, that is; on a 24-hour cycle, it is controlled by
the availability of sunlight.)

The uncertainty was triggered by a surprising result from satellite
images, which seemed to show that Amazon forests became greener
during the dry season, and greenest of all during years of severe drought
such as 2005 (S. R. Saleska et al. Science 318, 612; 2007). More green
means more photosynthesis, so this result suggested that it was the
availability of light, and not water, that was the controlling factor. Clear
skies and sunny weather were more important than moisture in the soil.

In a study published on Nature’s website today (D. C. Morton et al.
Nature http://dx.doi.org/10.1038/nature13006; 2014), researchers
show that this is, literally, an illusion. The forest does not become 
greener during dry periods at all. It just looks that way when the sensor 
and the Sun are both in the south of the sky. It is not photosynthesis 
that drives the apparent greening of the forest at such times, but a lack 
of shadow. 

The finding drags attention away from the importance of light in 
the Amazon's photosynthesis equation, and towards the need for 
water. But what of the third point of the triangle, carbon dioxide? 
There is uncertainty there too: this time over whether in years of 
drought, the trees will switch from being a net carbon sink to a source,
which could worsen global warming. A second study of the Amazon, 
on page 76, offers the latest data on this debate, 
and the news is not good. Fire and drought 
can indeed make the Amazon a net source of 
atmospheric carbon — whatever colour it is at 
the time.


WORLD VIEW


China must protect
high-quality arable land


Figures from a national survey of land use seem positive, but the effort
exposed some worrying trends, says Xiangbin Kong.


According to the Chinese government, China needs a minimum
of 120 million hectares of arable land to feed its people. That
is the ‘red line’ for food security that officials have pledged
to protect.

So it may seem like good news that the most recent comprehensive
survey of national land use in China has reported a healthy surplus:
some 135 million hectares of the country are classed as planted with
crops (rice paddy fields, irrigable land and dry farms). Simultaneously,
total grain production hit a record 602 million tonnes in 2013,
after a decade of continuous growth.

I fear these figures are not as positive as they seem. Moreover, I worry
that they may create a false sense of security and encourage policy-makers
to relax efforts to protect China’s arable land. We should not be misled
by the superficial surplus. The story is not so simple. Although the
quantity of arable land in China seems healthy, there are serious
concerns about its quality, and about its ability to supply future
generations with enough food.

I was involved in the land-use assessment, the second National Land
Resource Survey of China, and am pleased to see the results published
and discussed. The survey completed its work in 2009, but the previous
central government declined to publish the results, because
members did not agree with the findings of the
first such survey, finished in 1996.

This is not unusual. Surveys to classify large areas of remote land are
difficult. A study published in December suggested that the area of
cropland abandoned since 1990 in western Russia, Belarus and Ukraine
has been severely underestimated.

When President Xi Jinping came to power in 2012, he investigated
the discrepancy in the Chinese figures and decided that the results of
the second survey were robust, because they are based on high-resolution
remote sensing and backed up by investigation on the ground.
He authorized China’s land and resources ministry (MLRC) to release
the results at the end of 2013. At a press conference on 30 December,
officials from the MLRC pledged to continue to protect arable land,
and reinforced their commitment to the food security red line.

Beyond the headline figures, there are some worrying trends.
Although the overall area of arable land has increased in the time
between the two surveys, the quality of the land, and so its suitability,
has decreased. Some 3 million hectares of high-quality arable land
and some 1 million hectares of paddy land have been built on or
converted to urban use in just over a decade. More than 3 million
hectares have been contaminated with pollution. The effects were shown
starkly last year, when heavy metals such as cadmium appeared on the
tables of restaurants as a result
of rice being planted in polluted fields in Hunan province.

Arable land lost to development and contamination is frequently 
replaced by marginal and lower-quality alternatives — although land 
surveys such as ours do not distinguish between them. Of the land 
identified as arable in the latest figures, more than 4 million hectares 
in the southwest of the country are high in the mountains. And almost 
6 million hectares are in converted forest and grassland in the north, 
an ecologically fragile flood zone. Broadly, there has been a shift from 
growing crops in China's warm and humid south to the less suitable 
cold and water-limited north. 

The amount of available land has peaked. There is no spare high- 
quality arable land that can be cultivated as existing farmland is lost to 
development. Further conversion of grassland and forest produces low- 
grade alternatives, at great ecological cost. Rather than signalling secu- 
rity, the new land-use figures show that China is 
overusing its remaining high-quality arable land. 
It is growing more food on less land, a situation 
that leaves little scope for expansion — and little 
in reserve as water shortages reduce yields in the 
north still further. 

China needs to act to preserve its remaining 
high-quality arable land by classifying tracts of 
land as for permanent arable use, particularly 
in the southeast and in the suburbs of big cities. 
Restrictions should be put on development there, 
and greater efforts made to prohibit the agricul- 
tural conversion of marginal land in the north. 
Together, this would slow the agricultural shift 
towards the north and buy China some time. 

China should also rethink its existing protec- 
tion policy for arable land, which contributes to the problem because 
its programmes focus only on individual administrative regions. 
Instead, China should set aside crop production ‘priority zones’ at a 
national, provincial and county level on the basis of the arable land’s 
potential grain productivity and distribution. 

These zones should introduce protection for other types of farm- 
land, increase the subsidies paid to households that increase crop pro- 
duction per hectare, and make available funds for reclamation and 
restoration of degraded land to produce crops. This is a key point. 
China must work harder to improve the quality of low- and medium- 
grade arable land, which the country will increasingly rely on to feed 
itself. Better rural roads, forest management, and more irrigation 
canals and ditches are less newsworthy than headline announcements 
about record crop production, but, in the medium and long term, they 
will be more useful.


Xiangbin Kong is a land-use scientist at the College of Resources and 
Environment, China Agricultural University, Beijing. 
e-mail: kxb@cau.edu.cn 
RESEARCH HIGHLIGHTS
Selections from the scientific literature


Prion strings
pictured on cells


For the first time, researchers have captured images of prions
(proteins that can misfold and spread, causing neurodegeneration)
in living cells. The images show the proteins residing on the cell
surface in strings and webs.

Albert Taraboulos at the Hebrew University in Jerusalem and his
colleagues used antibodies that react with a subset of the misfolded
proteins to visualize the prions in cultured mouse cells and brain
tissue under a fluorescence microscope. The team found prion strings
up to five micrometres long that remained stable on the cell surface
for several hours.

This anchoring provides insight into how misfolded prions interact
with cells and can resist degradation, the authors say.
J. Cell Biol. http://dx.doi.org/10.1083/jcb.201308028 (2014)


Hydrogen river
could fuel stars


The discovery of a faint filament of hydrogen gas streaming across
space could help to explain how some galaxies maintain their pace of
star formation.

D. J. Pisano from West Virginia University in Morgantown used the
Robert C. Byrd Green Bank Telescope to identify a river of hydrogen
connecting the galaxy NGC 6946 (pictured) with its neighbours. Pisano
suggests that the filament could be the first observation of a ‘cold
flow’, a stream of diffuse gas from intergalactic space that has long
been theorized to be a source of fuel for star formation, and that is
invisible to most telescopes.

Alternatively, the hydrogen could have been drawn out during a close
encounter between NGC 6946 and its neighbours. Future galaxy surveys
should confirm the source of this hydrogen stream.
Astronomical J. 147, 48 (2014)


BIOTECHNOLOGY


CRISPR makes modified monkeys


Researchers have used precise gene-editing techniques to generate
genetically modified monkeys.

Previous models of human disorders in monkeys were created using
viruses to transfer genes, but this method lacks the precision needed
to modify specific gene sequences. Xingxu Huang at Nanjing University
in China and his colleagues turned to the CRISPR-Cas9 system, which
uses a customizable RNA fragment to guide the DNA-cutting enzyme Cas9
to a specific site. The team altered the genome in one-cell-stage
embryos of cynomolgus monkeys (Macaca fascicularis). This resulted in
the birth of twins (pictured) with mutations in two target genes:
Ppar-γ, which is involved in regulating metabolism; and Rag1, which is
involved in immune function.

The results pave the way for producing primate models with specific
mutations that more closely mimic human diseases.
Cell http://doi.org/q93 (2014)
For a longer story on this research, see go.nature.com/327cbd


Britain’s Anglo-Saxons
were local


Anglo-Saxons succeeded the Romans in Britain during the early fifth
century, probably through cultural adoption by local individuals
rather than through invasion by Germanic people.

Susan Hughes at the US Navy in Silverdale, Washington, and her team
analysed the tooth enamel of 19 individuals from an early Anglo-Saxon
cemetery in southern England, and measured the levels of oxygen and
strontium isotopes in the teeth. These levels are determined by the
water and food consumed by the individual. The researchers found that
the isotope ratios matched those of the surrounding water and soil,
suggesting that most of the people were local to that area. One
individual seemed to be an immigrant from the European continent.

The team says that its findings support the idea that Britain’s first
Anglo-Saxons were locals who rapidly shifted cultures after the fall
of Roman Britain.
J. Arch. Sci. 42, 81-92 (2014)


Lizards socialize 
to thrive 


Social isolation in early life 
could impair the development 
of reptiles, according to a study 
of chameleons. 

Social behaviour is well 
documented in mammals 
and birds, but it is not so 
firmly corroborated in cold- 
blooded vertebrates. Cissy 
Ballen and her colleagues 
at the University of Sydney 
in Australia compared the 
social interactions of veiled 
chameleon (Chamaeleo 
calyptratus) hatchlings raised 
in isolation with those raised 
in a group setting. The authors
found that socialized lizards 
were less submissive, displayed 
brighter and more saturated 
colours when encountering 
new chameleons, and captured 
food more quickly than did 
lizards raised in isolation. 

The findings add to evidence 
challenging the conventional 
view that reptiles are capable of 
only simple social behaviour. 
Anim. Behav. http://doi.org/q9h 
(2014) 


Night life fosters 
foul sprays 


Carnivores that spray foul 
anal secretions might have 
evolved this ability in response 
to night-time predation from 
other mammals. 

Theodore Stankowich at 
California State University in 
Long Beach and his colleagues 
looked at the behaviour of 
181 species of carnivorous 
mammals and their predators. 
The authors found that 
carnivores are targeted mainly 
by other mammals at night and 
by birds of prey during the day. 
Animals that are active during
the day are more likely to
develop tight-knit social groups 
that are better at detecting and 
warding off predators. 

Nocturnal animals cannot 
rely on early visual detection 
and instead use short-range 
defence systems such as 
noxious sprays, which are 
more effective against other 
mammals than against birds. 
Evolution http://doi.org/q9w 
(2014) 


NEUROSCIENCE
Pruning problems 
alter brain wiring 


Abnormal pruning of neuronal 
connections might stall brain 
maturation, resulting in 
reduced brain connectivity 
and even behaviours linked to 
disorders such as autism. 
Cornelius Gross at the 
European Molecular Biology 
Laboratory in Monterotondo, 
Italy, and his colleagues studied 
mice that were engineered to 
have fewer microglia — non- 
neuronal brain cells that trim 
back synapses, or neuronal 
connections, during brain 
development. These animals 
had fewer synapses between 
neurons and decreased 
connectivity between brain 
regions, and seemed to be less 
social in behavioural tests. 
Microglia and synaptic 
pruning are important for 
normal brain development, 
and problems with this 
pruning could lead to 
neurodevelopmental 
disorders, the authors say. 
Nature Neurosci. 
http://doi.org/rbf (2014) 


How big galaxies
died fast


Astronomers have worked out the origin of giant galaxies that seemed
to have fizzled early in the Universe’s history, just three billion
years after the Big Bang.

To find out how massive elliptical galaxies became so big and stopped
forming stars so quickly, Sune Toft of the Niels Bohr Institute in
Copenhagen and his colleagues compared samples of these dead galaxies
and an earlier generation of star-forming ones observed with the
Hubble, Herschel and Spitzer space telescopes. The authors conclude
that earlier, gas-rich galaxies merged, kicking off intense star
formation that rapidly used up all the gas, resulting in the large,
burnt-out galaxies.
Astrophys. J. 782, 68 (2014)


United States tops warming list


The United States is the largest national contributor to global
climate warming, followed by China, Russia, Brazil and India.

Damon Matthews and his colleagues at Concordia University in Montreal,
Canada, analysed the national emissions of greenhouse gases and
aerosols, including those from land use, between 1800 and 2005. They
calculated that a total warming of 0.7°C occurred during this period,
and that more than 21% of this total is linked to the United States.
China and Brazil exceed the United States slightly in terms of their
contributions from land-use activities, such as deforestation and
agriculture, but the high level of cumulative US fossil-fuel use makes
the country the biggest contributor overall.

Among the major emitters, the United Kingdom and the United States top
the rankings on a per capita basis, with contributions that are more
than ten times higher than those of either China or India.
Environ. Res. Lett. 9, 014010 (2014)


Tiny cracks 
toughen up glass 


Glass etched with intricate 
micropatterns is much 
tougher than normal glass, 
report Francois Barthelat 
and his colleagues at McGill 
University in Montreal, 
Canada. 

The researchers were 
inspired by natural materials 
such as tooth enamel and 
nacre in mollusc shells, which 
are stiff and hard, but not 
brittle. In these structures, 
cracks are unable to spread 
rapidly because they are forced 
to travel along tortuous or
interlocking channels that
are criss-crossed by proteins 
holding the structure together. 
The researchers etched 
similar patterns into glass 
(pictured) and filled in the 
gaps with shock-absorbent 
polyurethane, creating a 
material that is 200 times 
tougher than standard glass. 
The approach could be used 
to make brittle materials such 
as ceramics shatter-resistant. 
Nature Commun. 5, 3166 (2014) 
SEVEN DAYS


Dengue control 


Panama has joined a handful of nations trying to combat dengue fever
using genetically modified mosquitoes developed by Oxitec, a
biotechnology firm in Oxford, UK (see go.nature.com/tht55x). The
company announced on 28 January that the Panamanian government had
approved open-field trials of the insects, engineered to be sterile,
as a means of suppressing wild populations of the dengue-carrying
mosquito (Aedes aegypti). At the end of 2013, Panama’s health minister
declared that the country was experiencing a dengue epidemic.


Oil pipeline 

On 31 January, the US 
Department of State released 
its final environmental 
assessment of the proposed 
Keystone XL pipeline — a 
controversial project that 
would link oil sands in 
Alberta, Canada, to refineries 
along the Gulf of Mexico. 
Environmentalists have 
argued that the pipeline would 
increase carbon emissions by 
facilitating fuel production. 
But the agency concluded that approval or denial of the pipeline is
“unlikely to significantly impact” production or consumption of the
oil, which could travel by alternative routes. The White House must
now decide whether to proceed with the project.


Food foundation

The sweeping US$956-billion ‘Farm Bill’, passed by the US House of
Representatives on 29 January, includes some research support among
its wider measures regulating food-stamp payments and farmer
subsidies. The bill, formally known as the Agricultural Act of 2014,
authorizes the creation of a non-profit corporation called the
Foundation for Food and Agriculture Research. It calls for an initial
$200 million for the foundation to support research in areas such as
plant and animal health, nutrition and renewable energy. The Senate is
expected to pass the bill this week.


Butterfly migration hits historic low

Fewer monarch butterflies migrated across North America in 2013 than
in any previously recorded year, according to a report released on
29 January by conservation group the WWF. Surveys of forested regions
in Mexico, where monarch butterflies (Danaus plexippus) hibernate from
November to March, found the creatures occupying only 0.67 hectares of
land, a 44% drop from the previous year, and the smallest area since
surveys began in 1993. Area occupied is used as an indicator of
population size. Changes in land use and extreme climate conditions
along the roughly 4,000-kilometre migration route from Canada to
Mexico have contributed to the decline, as has deforestation of
hibernation sites, says the WWF. Use of agricultural herbicides has
reduced the availability of milkweed, a key food source.


Reef waste dump

The authority in charge of Australia’s Great Barrier Reef has approved
a controversial dumping of dredging waste in the marine park around
the immense coral edifice. The Great Barrier Reef Marine Park
Authority says that “strict environmental conditions” will be in place
for the dumping of up to 3 million cubic metres of spoil originating
from the expansion of the Abbot Point coal port. But environmental
groups say that the move will endanger the ecosystem and threatens the
reef’s status as a UNESCO World Heritage Site.


Stem-cell genomics

Researchers from seven Californian institutions have bagged a
US$40-million grant to establish a centre that will apply large-scale
genetics studies to stem-cell research. The award from the California
Institute for Regenerative Medicine in San Francisco, announced on
29 January, will support a Center of Excellence in Stem Cell Genomics
led jointly by Stanford University in Palo Alto and the Salk Institute
for Biological Studies in San Diego. The selection process raised
protests from other applicants, who questioned departures from review
procedures used in previous grant cycles.


Cancer genetics 

The US National Cancer 
Institute in Bethesda, 
Maryland, has launched one of 
the first trials to assess whether 
cancer treatments that are 
tailored to individual genetic 
profiles are more beneficial 
for patients than non-targeted 
treatments. The Molecular 
Profiling based Assignment 
of Cancer Therapeutics 
(M-PACT) study, announced 
on 30 January, will screen 
tumours from 180 patients 
for mutations in 20 genes that 
are known to affect treatment. 
Half of the patients will 

then receive therapy that is 
customized for their specific 
mutations, and half will receive 
non-customized therapy. The 
findings are expected to be 
reported in 2017. 


Lab-animal reforms 
Responding to criticisms, 
Imperial College London on 
31 January unveiled a plan 
for “wholesale reform” of the 
ethical review and governance 
of its animal research. Last 
year, the university underwent 
an independent review after 
an undercover investigation by 
anti-vivisectionists produced 
allegations of malpractice 
(see Nature http://doi.org/rbd; 2013). 
See go.nature.com/7vqf2t for more. 


PEOPLE 


Science leader 
Chemist Geraldine Richmond 
(pictured) has been chosen 
as the next president-elect of 
the American Association for 
the Advancement of Science 
(AAAS) in Washington DC. 
Richmond, who is a professor 
at the University of Oregon in 
Eugene, studies the chemistry 
of surfaces and interfaces. 
She is a member of the US 
National Academy of Sciences 
and the founder and chair 
of the Committee on the 
Advancement of Women 
Chemists, an organization 
that supports female scientists 
and engineers. She will take 
over as AAAS president in 
February 2015. 


Neglected diseases 
Global health advocates 
expressed dismay last week 
over news that pharmaceutical 
giant AstraZeneca is ending 
research on treatments for 
tuberculosis, malaria and 
neglected tropical diseases. In 
2012 the company, which has 
its headquarters in London, 
joined a coalition to eradicate 
neglected tropical diseases, 
which affect 1.4 billion people 
worldwide. Geneva-based 
advocacy group Médecins 
Sans Frontières called the 
latest move “discouraging” 
and highlighted the need 
to combat neglected diseases 
that afflict the world’s 
poorest people (see Nature 
505, 142; 2014). 


Data partners 


Pharmaceutical giant Johnson 
& Johnson announced on 
30 January a new partnership 
with Yale University in 
New Haven, Connecticut, 
to share data from the 
company’s clinical trials. The 
Yale University Open Data 
Access project will serve 
as an independent body to 
review and manage requests 
from researchers seeking 
anonymized clinical-trial data 
from the company, of New 
Brunswick, New Jersey. The 
move follows initiatives to 
promote clinical data sharing 
and transparency in the 
United States and in Europe 
(see Nature 505, 131; 2014). 


TREND WATCH 


By 2025, there will be more than 
20 million new cancer cases per 
year, compared with 14.1 million 
in 2012, according to the World 
Cancer Report 2014, released on 
3 February by the World Health 
Organization’s International 
Agency for Research on Cancer. 
Demographic changes and 
increased life expectancy are 
responsible, the report says. The 
greatest impact will be on low- 
and middle-income countries, 
noted Margaret Chan, director- 
general of the World Health 
Organization. 

THE BURDEN OF CANCER 
Asia experienced around 46% of global new cancer cases in 
2012, but 53% of cancer mortalities. 
INCIDENCE: 14.1 million new cases 


6-9 FEBRUARY 
The Molecules and 
Materials for Artificial 
Photosynthesis 
conference in Cancún, 
Mexico, highlights 
the latest research in 
molecular catalysts, solar 
cells and nanomaterials 
for energy conversion 
and storage. 
go.nature.com/asfpi5 


12-15 FEBRUARY 
Researchers gather in 
Marco Island, Florida, 
for the annual Advances 
in Genome Biology and 
Technology meeting. 
Topics include prospects 
for next-generation 
sequencing in cancer 
treatment, and the use 
of genomic tools to 
study neural circuits. 
go.nature.com/dmkkmp 


School suspension 


Educational company 
Coursera has blocked access 
to services for students 
from Cuba, Iran and Sudan. 
The firm, which is based in 
Mountain View, California, 
specializes in massive open 
online courses (see Nature 
495, 160-163; 2013). Citing 
US export regulations that 
restrict services to sanctioned 
nations, Coursera said 
on 28 January that it had 
begun blocking users from 
logging on to its website 
from IP addresses in affected 
countries. Access for students 
from Syria was initially 
revoked, but was reinstated 
after the company learned of a 
regulatory exception. 


Drilling deferred 

Oil company Royal Dutch 
Shell has shelved its 2014 
drilling programme off the 
coast of Alaska. Speaking to 
investors on 30 January, chief 
executive Ben van Beurden 
cited a 22 January court ruling 
that the US government 

did not properly assess the 
potential environmental 
impacts of offshore drilling in 
the Chukchi Sea (see Nature 
505, 590; 2014). “The lack of a 
clear path forward means that 
I am not prepared to commit 
further resources for drilling 
in Alaska in 2014,” he said. 


> NATURE.COM 
For daily news updates see: 
www.nature.com/news 
There is little evidence that bite marks on a crime victim’s skin allow reliable identification of the perpetrator. 


CRIMINOLOGY 


Faulty forensic 
science under fire 


US panels aim to set standards for crime labs. 


BY SARA REARDON 


For 19 years, Gerard Richardson sat in 
prison in New Jersey wondering how 
forensics experts had got his case so 
wrong. His conviction for a 1994 murder was 
based on a bite mark on the victim's body that 
seemed to match his own teeth; it was the main 
physical evidence linking him to the crime. 
Last year, he was exonerated when DNA taken 
from the same bite mark turned out not to be 
his. According to the Innocence Project in 
New York, which tracks wrongful convictions, 
more than half of DNA exonerations involve 
faulty forensic evidence from crime labs and 
unreliable methods such as bite-mark analysis. 

Cases such as Richardson’s are one reason 
why the US Department of Justice and the 
National Institute of Standards and Technol- 
ogy (NIST) have now created the first US 
national commission on forensic science. 
The panel of 37 scientists, lawyers, forensics 
practitioners and law-enforcement officials 
met for the first time this week in Washington 
DC, and aims to advise on government policies 
such as training and certification standards. 
In March, NIST will begin to set up a parallel 
panel, a forensic-science standards board that 
will set specific standards for the methods used 
in crime labs. 

For many scientists, this hard look at foren- 
sic science comes none too soon. “The broad 
objective is to put the science into forensic 
science so it can legitimately have the name,” 
says commission member Stephen Fienberg, 
a statistician at Carnegie Mellon University in 
Pittsburgh, Pennsylvania. In 2009, the National 
Research Council (NRC) released a damn- 
ing report criticizing US forensics practices. 
According to the report, nearly every analyti- 
cal technique, from hair-sampling methods to 
those used in arson investigation, is unreliable, 
with too much variability in test results. Only 
DNA evidence escaped condemnation. 

In addition, the NRC was concerned about 
forensics lab training. In 2009, only 60% of 
publicly funded crime labs employed a certified 
examiner. And the report called for standards 
to ensure that all labs evaluate evidence in the 
same way. Very often, it said, two labs analysing 
evidence from a crime scene will come up with 
different results using the same method. 

The NRC offered a list of fixes, including the 
creation of a government agency with regula- 
tory power and a research budget. Much like 
the NRC, the commission is only an advisory 
body that will offer expert opinions. But by 
having the ear of the US Attorney General, who 
can order changes in federal-agency practices, 
the national commission could be influential, 
says John Butler, a forensic geneticist at NIST 
and the commission's vice-chair. The commis- 
sion will meet and produce recommendations 
until April 2015, although Butler says that its 
remit may be extended. 

The two panels’ recommendations will 
not directly affect practices in state and local 
labs, which handle more than 90% of foren- 
sics needs. But their visibility could cause rec- 
ommended standards to trickle down. If that 
does not work, the federal government could 
withhold grants to labs that do not conform to 
new standards, or limit access to federal DNA 
databases. 

Even in DNA collection, there are discrep- 
ancies between standard practices in federal, 
state and individual labs. The FBI, for 
instance, records 13 specific base-pair locations, 
or loci, from DNA samples in its national 
database, to ensure that false matches do not 
occur. But in 2008, the San Francisco Police 
Department in California used a 30-year-old, 
low-quality DNA sample from a murder case 
to convict a 70-year-old man who was listed 
in its state database — even though only five 
loci were matched. In a database the size of 
California’s, matching based on these five loci 
would identify an innocent person one-third 
of the time. 
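
A figure of that size can arise whenever a weak match criterion is run against a very large database. The arithmetic can be sketched as follows, assuming every profile in the database is an independent draw with the same chance of a coincidental match. The function name and the example numbers are hypothetical; they were chosen only so that the result lands near one-third and are not the figures from the San Francisco case.

```python
import math


def coincidental_match_probability(random_match_probability: float,
                                   database_size: int) -> float:
    """Chance that at least one unrelated profile in a database matches by coincidence.

    Simplifying assumption: each of the `database_size` profiles matches
    independently with probability `random_match_probability`.
    """
    if not 0.0 <= random_match_probability < 1.0:
        raise ValueError("random_match_probability must be in [0, 1)")
    if database_size < 0:
        raise ValueError("database_size must be non-negative")
    # 1 - (1 - p)^N, written with log1p/expm1 so that tiny p values stay accurate.
    return -math.expm1(database_size * math.log1p(-random_match_probability))


# Hypothetical inputs: a one-in-a-million partial-profile match probability
# searched against 400,000 profiles.
example = coincidental_match_probability(1e-6, 400_000)
print(f"Chance of at least one coincidental hit: {example:.1%}")
```

The point of the sketch is that a match probability that sounds tiny for a single comparison stops being tiny once it is multiplied across hundreds of thousands of comparisons.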

Even good standards and best practices 
do not mean that a technique is solid, says 
Fienberg. Trained polygraph operators, for 
instance, can obtain consistent test results, but 
whether the machines accurately detect lies 
is highly uncertain. Many law-enforcement 
agencies still use the technique, even though 
a 2003 NRC report found it to be unreliable. 

“The fundamental issues with forensic sci- 
ence can be solved by fixing the science,” says 
Suzanne Bell, a forensic chemist at West Vir- 
ginia University in Morgantown. Bell says that 
the field needs more research funding. In 2012, 
the National Institute of Justice funded just 
US$5 million in basic forensic-science research. 

The value of certain techniques is often 
overstated in court cases, says Simon Cole, 
who studies the history of science in the 
criminal justice system at the University of 
California, Irvine. Fingerprint comparison, 
for instance, is often presented as an exact 
science, but researchers have only recently 
begun to study just how well people can do 
the matching. A 2011 study found that pro- 
fessional examiners matched two finger- 
prints incorrectly once in every 1,000 times, 
and missed a correct match 7.5% of the time 
(B. T. Ulery et al. Proc. Natl Acad. Sci. USA 
108, 7733-7738; 2011). Cole would like the 
standards board to define a ‘match’ precisely, 
and to assess the extent to which different 
methods yield different results. 

The standards board could also question 
how widely some of the more dubious tech- 
niques should be used. Mary Bush, a forensic 
dentist at the State University of New York in 
Buffalo, says that there is little evidence that 
bite marks left in skin can reliably identify per- 
petrators. In her lab, moulds of different sets of 
teeth were clamped into the skin of cadavers. 
Digital images of the marks were then ana- 
lysed. Often, the marks could not be used to 
identify the teeth responsible. 

Gregory Golden, president of the American 
Board of Forensic Odontology, argues that the 
method is useful for eliminating suspects or 
determining whether a bite mark is human. 

According to the Innocence Project, how- 
ever, at least 15 people whose convictions 
involved bite marks and who served time in 
prison have been exonerated through DNA 
evidence since 1993. That alone suggests that 
the method should be investigated, says Bush. 
“We're fighting 30 years of precedent.” 
Immigrants and visitors to the United Kingdom often face red tape and discouraging policies. 


IMMIGRATION 


UK visa problems 
worry scientists 


Immigration policies scare off foreign talent, warn critics. 


BY DANIEL CRESSEY 


The United Kingdom’s increasingly 
tough stance on immigration is driving 
foreign scientists to competing nations, 
the academic community has warned. 

At a meeting with the Home Office last 
month, representatives of leading universities 
and scientific organizations said that unwel- 
coming government rhetoric about reducing 
immigration, together with complicated visa 
procedures for visiting researchers, make Brit- 
ain an unattractive destination for scholars. 

The Campaign for Science and Engineer- 
ing (CaSE) in London, which promotes 
science-friendly policies and coordinated the 
meeting, is now actively lobbying the govern- 
ment to change its policies to avoid scaring 
away international students and academics. 

The House of Lords, the upper chamber of 
Parliament, has started an investigation. 

“The really big issue is the one of how the 
UK is perceived internationally and how 
attractive it seems to people who wish to come 
here,” says Sarah Main, director of CaSE. 

She adds that CaSE’s members, which 
include universities, scientific societies and 
businesses, have expressed concern both 
about the complexity and bureaucracy of 
certain visa schemes, and more generally 
“about the welcome being offered to often very 
senior academics and professionals” when 
they try to come to the United Kingdom. 

In 2010, CaSE launched a campaign to keep 
the United Kingdom open to scientists (see 
Nature 468, 346; 2010). Subsequently, the 
government altered various rules; for example, 
it exempted employers of PhD-level staff from 
a requirement to prefer candidates who already 
have UK residency. 


In addition, a whole new visa type — the 
‘exceptional talent route’ — was launched to 
attract skilled migrants (see Nature 476, 243; 
2011). This created up to 700 places per year 
for scientists to enter the United Kingdom, if 
they are endorsed by the Royal Society, the 
Royal Academy of Engineering or the Brit- 
ish Academy. There are also 300 places for 
people working in the arts. 

But the success of these moves has been 
mixed, and in recent months the government 
has been taking a harder line on immigration. 
It now wants to reduce net migration to the 
United Kingdom from 182,000 people in 2012 
to tens of thousands per year by 2015. 

Susan Kay, executive director of the Engineer- 
ing Professors’ Council in Horsham, says that 
scientists complaining about the immigration 
system “have been getting louder”. “We're now 
viewed overseas as quite a potentially 
unwelcoming place to be for academics,” she says. 

The visa system “needs to be simpler, it needs 
to be more accessible”, says Kay, who was at the 
CaSE meeting and was “very much encouraged” 
by the Home Office's receptiveness. In a state- 
ment, immigration minister Mark Harper said 
that the government was building a system that 
“supports growth by curbing abuse, while still 
welcoming the brightest and the best”. 

However, the Home Office may argue that 
the scientific community has not exploited the 
concessions that it was granted in 2010. The 
exceptional-talent system has been hugely 
undersubscribed, with only 89 applicants in 
total entering the United Kingdom through 
this route between its launch in 2011 and June 
2013. The reasons for this are unclear. 

“It's a shame. It means universities are miss- 
ing out on exceptionally talented people who 
could be using that route. Second, they've lost 
the argument with the Home Office if they don't 
make this work,” says Ian Robinson, a senior 
manager at immigration law firm Fragomen in 
London. He helped to design the exceptional- 
talent route while working at the Home Office. 


COMPLEX SYSTEM 

Robinson says that the UK system has many 
advantages. It is based on checking applicant- 
supplied evidence against set criteria, so it 
provides certainty that can be lacking in more 
subjective methods, such as the US and Aus- 
tralian interview systems (although US immi- 
gration rules are currently being reformed; 
see ‘Principles for compromise’). It is also 
fast — according to figures from Fragomen, a 
work-visa application to the United Kingdom 
is processed in an average of 15 days or fewer, 
versus 46-60 days for France and Germany, 
and 76 or more for Spain and Italy. 

Overall, says Robinson, concerns are 
“largely down to perception. Generally the 
system does work. Myth-busting is needed — 
the Home Office and the scientific community 
need to go out there and say this is the system 
and this is how you use it.” 

Student immigration is also a source of concern, 
with statistics released last month showing 
that the number of students coming to the 
United Kingdom from outside the European 
Union fell from 302,680 in 2011-12 to 299,970 
in 2012-13 — the first recorded drop ever. 
Rules on student visas have been toughened 
up in the past few years; for example, in 2011 
restrictions were placed on graduates who stay 
on to work after they complete their studies. 

A House of Lords committee is this week 
holding the first evidence session of an inquiry 
into whether visa problems are deterring science 
and engineering students. John Krebs, 
who heads the committee, told Nature that 
there is “ongoing concern” about the issue. 

Another concern is the problems faced by scientists 
seeking to visit the country to attend conferences 
or give talks, who fall into the ‘academic 
visitor’ visa category. In January to September 
last year, the UK Border Agency received 4,770 
applications for such visas, and rejected 625 
(13%). The grounds for rejections are not published, 
but the rejection rate has stayed roughly 
consistent for the past six years. 

Denis Noble, president of the International 
Union of Physiological Sciences and a systems 
biologist at the University of Oxford, was 
involved in organizing last year’s union meeting 
in Birmingham, which attracted around 
3,200 delegates. But in the weeks before the 
conference, he says, he spent nearly all of his 
time dealing with visa problems experienced 
by around 40 people who wanted to attend. 

“From the experience I’ve had, there is an 
image problem in the United Kingdom. It’s 
important that the right people try to put that 
right,” he says. SEE EDITORIAL P.6 


Additional reporting by Richard Van Noorden 


US IMMIGRATION REFORMS 


Principles for compromise 


In a move that may aid scientists who 
want to live and work in the United States, 
Republican party leaders in the House of 
Representatives released a set of principles 
for immigration reform on 30 January. 
It marks a tentative sign that they might 
consider compromising with broader efforts 
passed by the Democratic-controlled Senate. 

Under current policy, the thousands 
of scientists and engineers who seek 
permanent residence in the United States 
must compete for a total of 140,000 ‘green 
cards’ each year. Per-country limits on those 
permits often leave applicants from China, 
India and other oversubscribed countries 
waiting years for approval. 

A Senate bill passed last June would 
relieve the green-card backlog by 
eliminating country-based caps (see Nature 
499, 17-18; 2013). The measure would 
also create an unlimited number of green 
cards for immigrants with master’s or 
doctoral degrees in science, technology, 
engineering or mathematics (STEM) 
subjects from US universities. 

But the House, controlled by Republicans, 
baulked at the sweeping plan, which included 
a controversial path to citizenship for illegal 
immigrants. It chose to pursue smaller 
immigration proposals. One measure, in the 
works since last year, agrees with scrapping 
country limits on green cards, but would give 
55,000 spots to holders of advanced STEM 
degrees from US universities. 

For any of the bills to become law, the two 
parties must overcome deep divisions on 
issues that have nothing to do with visas for 
scientists. The principles floated last week 
include options for granting limited legal 
status to illegal immigrants — a significant 
step towards the Democrats’ position, and 
a sign that negotiations on immigration 
reform could restart. 

But immigration reform has eluded 
lawmakers for years, and few observers 
are holding their breath. “It’s hard for me 
to get too excited right now, until we start 
seeing people start to come out and say 
more,” says Benjamin Corb, director of 
public affairs for the American Society for 
Biochemistry and Molecular Biology in 
Rockville, Maryland. Helen Shen 
Some contributors to citizen-science initiatives, such as Project Noah, sport a tattoo of the project’s logo. 


CITIZEN SCIENCE 


Computer sharing 
loses momentum 


Competition and education needed to keep people engaged. 


BY NICOLA JONES 


The family of ‘@home’ volunteer 
computing projects is growing ever 
more diverse. Spare time on a personal 
computer can now be donated to anything 
from finding alien life to crunching climate 
models or processing photos of asteroids. But 
enthusiasm is waning. The 47 projects hosted 
on BOINC, the most popular software system 
for @home efforts, have 245,000 active users 
among their 2.7 million registrants, down from 
a peak of about 350,000 active users in 2008 
(see ‘Slumping @home’). 

David Anderson, the founder of BOINC 
(Berkeley Open Infrastructure for Network 
Computing) and a computer scientist at the 
University of California, Berkeley, has several 
explanations for the slip. He says media cover- 
age has declined now that volunteer computing 
is more than 15 years old. A shift to mobile- 
computing devices has probably also hurt — 
BOINC can run on an Android phone while 
charging, but uses too much battery power 
when unplugged. And the site has been unable 
to attract a broad demographic of volunteers. 

“Essentially, we have a bunch of middle- 
aged, male computer nerds,” says Anderson. 
“We have thought long and hard about ways 
to break out of that category, using Facebook, 


for example, but none of that has been all that 
successful.” 

On 20-22 February, at the 3rd Citizen 
Cyberscience Summit in London, conference- 
goers will trade tips on how to entice volun- 
teers into projects ranging from BOINC-style 
distributed computing to more-active ‘citizen- 
science’ projects, in which users are asked to 
donate not just their time but also their brains. 

SLUMPING @HOME 
The past several years have seen a decline in the 
number of active users in the BOINC family of 
volunteer computing projects. 

The desire to keep numbers up is not just 
academic. If distributed computing flourishes, 
serious money can be saved, says Francois 
Grey, coordinator of the Citizen Cyberscience 
Centre, based in Geneva, Switzerland. He 
notes that the Chinese Academy of Sciences 
in Beijing has been monitoring the economic 
benefits of CAS@home, which uses volunteers’ 
computing time for projects such as predicting 
protein structures. The academy estimates that 
US$20 million has been saved since it launched 
CAS@home in September 2010, by using 
donated computing power rather than buying 
it from a company such as Amazon. 

Grey predicts that funding bodies might at 
some point enforce the use of volunteer com- 
puting whenever possible, rather than allow- 
ing grant money to be used for supercomputer 
time or cloud-based services. “It’s very delicate. 
There are big IT companies with vested inter- 
ests in selling supercomputers to universities,” 
he says. “But I think it’s something that will 
happen at some point.” 

For volunteer computing to be used in a big- 
ger way, participation rates need to keep up. 
Perhaps the most obvious motivator — money 
— is deemed a bad idea. “Small amounts of 
money are too trivial, and may be almost 
insulting,” says Grey. “It goes against the idea 
of volunteering.” Only one BOINC project — 
IBM’s World Community Grid, an umbrella 
initiative that oversees a batch of biomedical 
projects aimed at goals such as drug discov- 
ery — has partnered with a scheme that allows 
volunteers to earn virtual cash (which can be 
exchanged for real money) for their time. This 
had a measurable but small overall impact, says 
Anderson, earning the grid as many as 15,000 
new volunteers, bringing the total so far up to 
almost 650,000. 

A more powerful motivator is pleasure. This 
can be achieved by turning participation into 
a game. FoldIt, for example, asks volunteers to 
optimize protein folding, which requires a mix 
of intellect and intuition that some describe as 
similar to chess. Competition can also provide 
pleasure. Many projects offer scoreboards and 
awards such as virtual titles or badges to mark 
progress; some people have become so devoted 
that they have had the badges tattooed on their 
bodies. In the BOINC world, groups of vol- 
unteers have formed teams that compete to 
donate the most time over a designated period. 
These competitions offer a short-term boost, 
but the effect wears off, says Anderson. 

Engaging participants in the core science 
mission is by far the best motivator, says Oded 
Nov, who studies links between new technolo- 
gies and human behaviour at New York Uni- 
versity. That includes giving participants credit 
in scientific papers and showing them how 
their help is advancing research. The World 
Community Grid, for example, hosts regu- 
lar Q&A sessions with its project scientists. 
“Education is a great motivator,” says Nov. 

That could be one reason why the Zooniverse 
— the largest host of citizen-science schemes — 
has not seen a decline in participation. Its family 
of 22 projects asks volunteers to do everything 
from identifying galaxy types in astronomical 
images to transcribing historical weather 
records. Robert Simpson, a developer and head 
of communications for the Zooniverse team, 
says that the five-year-old scheme has 930,000 
registered participants and that there is fairly 
consistent interest in new projects. 

Quantifying the effects of different motivational 
tools is difficult, says Grey, whose 
cyberscience centre has received funding to 
explore the possible benefits of common rules 
and credit schemes across different platforms 
such as BOINC and the Zooniverse. “Because 
of its grass-roots nature, everyone's doing their 
own thing; there's no common metric,” he says. 

One thing is certain: there is still plenty of 
spare brainpower to access. “US citizens alone 
spend 200 billion hours watching television a 
year,” says Simpson. “We only need to tap a tiny 
fraction of that.” 


Elsevier opens its 
papers to text-mining 


Researchers welcome easier access for harvesting content, 
but some spurn tight controls. 


BY RICHARD VAN NOORDEN 


Academics: prepare your computers for 
text-mining. Publishing giant Elsevier 
says that it has now made it easy for 
scientists to extract facts and data computa- 
tionally from its more than 11 million online 
research papers. Other publishers are likely to 
follow suit this year, lowering barriers to the 
computer-based research technique. But some 
scientists object that even as publishers roll out 
improved technical infrastructure and allow 
greater access, they are exerting tight legal con- 
trols over the way text-mining is done. 

A few years ago, scientists complained that 
publishers were stymieing ambitious plans to 
use computer software to pull out information 
from published papers. Some researchers who 
ran software to harvest data from online articles 
found their programs blocked, and those who 
asked for permission found themselves trapped 
in tortuous case-by-case negotiations — even 
though they had already paid subscription fees 
for access. Max Haeussler, a computational biol- 
ogist at the University of California, Santa Cruz, 
for instance, spent more than three years argu- 
ing with publishers for permission to extract 
DNA data from 3 million articles to annotate an 
online map of the human genome (see Nature 
483, 134-135; 2012). 

“It was a legitimate criticism, that people 
sent text-mining requests in to publishers and 
they bounced around for a time without any 
response,” admits Chris Shillum, vice-president 
of product management for platform and 
content at Elsevier. The publisher previously 
considered requests “case by case”, he says — 
but it now wants to make text-mining permissions 
quicker and easier to obtain. “What we've 
tried to do is take the practical barriers away.” 

Under the arrangements, announced on 
26 January at the American Library Association 


conference in Philadelphia, Pennsylvania, 
researchers at academic institutions can use 
Elsevier’s online interface (API) to batch- 
download documents in computer-readable 
XML format. Elsevier has chosen to provi- 
sionally limit researchers to 10,000 articles per 
week. These can be freely mined — so long 
as the researchers, or their institutions, sign a 
legal agreement. The 
deal includes conditions: for instance, 
that researchers may publish the products 
of their text-mining work only under a 
licence that restricts use to non-commercial 
purposes, can include only snippets (of up 
to 200 characters) of the original text, and must 
include links to original content. 
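
For researchers writing mining scripts, the two numeric limits reported here (10,000 articles per week and snippets of up to 200 characters) are easy to enforce in code. The following is a minimal sketch under stated assumptions: it does not call Elsevier's interface, and every name in it (`MinedFact`, `make_fact`, `within_weekly_quota`, the example URL) is hypothetical. It only shows how a published output could be kept within the snippet limit and tied to a link back to the source.

```python
from dataclasses import dataclass

# Limits as reported in the article; the real agreement may define them differently.
MAX_SNIPPET_CHARS = 200
WEEKLY_ARTICLE_LIMIT = 10_000


@dataclass(frozen=True)
class MinedFact:
    """One publishable text-mining result: a short excerpt plus its source link."""
    snippet: str
    source_url: str


def make_fact(text: str, source_url: str) -> MinedFact:
    """Truncate the excerpt to the snippet limit and require a link to the original."""
    if not source_url:
        raise ValueError("a link to the original content is required")
    return MinedFact(snippet=text[:MAX_SNIPPET_CHARS], source_url=source_url)


def within_weekly_quota(already_downloaded: int, requested: int) -> bool:
    """Check a planned batch against the provisional weekly download cap."""
    return already_downloaded + requested <= WEEKLY_ARTICLE_LIMIT


# Hypothetical usage with a placeholder URL.
fact = make_fact("x" * 500, "https://example.org/article/1")
print(len(fact.snippet), fact.source_url)
```

The non-commercial licence condition is a legal matter and cannot be checked by code of this kind.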

“Finally, someone is showing that there is no 
need to be afraid of text-mining analysis any 
more,” says Haeussler. 

Researchers working on the Human Brain 
Project — a European consortium that plans 
to use a supercomputer to recreate everything 
known about the human brain — have already 
used Elsevier's interface to do text-mining, says 
the project’s spokesman Richard Walker, who 
is based at the Swiss Federal Institute of Tech- 
nology in Lausanne. “We are very pleased with 
it. It resolves genuine technical issues,” he says. 

And neuroscientist Shreejoy Tripathy at the 
University of British Columbia in Vancouver, 
Canada, worked with Elsevier last year to pull 
out information on neuron physiology from 
thousands of articles (see neuroelectro.org). 
Text-mining is not yet well known, he says, 
but he hopes that the easier access will kick 
off its greater adoption among scientists. 
“As more papers get published that use text-mining,
other researchers like myself, who
are neuroscientists and not programmers,
will see the need for the technique,” he says.

Shillum says that Elsevier is ahead of the 
curve — but that other publishers are likely to 
follow soon. CrossRef, a non-profit collabora- 
tion of thousands of scholarly publishers, will 
in the next few months launch a service that 
lets researchers agree to standard text-mining 
terms and conditions by clicking a button on a 
publisher's website, a ‘one-click’ solution similar 
to Elsevier's set-up. 

And, in the past year, large institutions and 
pharmaceutical companies have started to ask 
for text- and data-mining rights when renego- 
tiating site licences, says Jessica Rutt, rights and 
licensing manager at Nature Publishing Group 
(NPG), the publisher of this journal. Anyone 
with those rights may mine NPG content. 
Many publishers are also experimenting with 
delivering text-minable content to pharmaceu- 
tical companies for an extra fee, she adds. 

But some researchers feel that a dangerous 
precedent is being set. They argue that pub- 
lishers wrongly characterize text-mining as an 
activity that requires extra rights to be granted 
by licence from a copyright holder, and they 
feel that computational reading should require 
no more permission than human reading. 
“The right to read is the right to mine,” says
Ross Mounce of the University of Bath, UK, 
who is using content-mining to construct 
maps of species’ evolutionary relationships. 

National governments are also weighing in 
on the issue. The UK government aims this 
April to make text-mining for non-commer- 
cial purposes exempt from copyright, allowing 
academics to mine any content they have paid 
for. And the European Commission, worried 
that barriers to computational research could 
hinder scientific innovation, is also examin- 
ing the issue. It has convened a group chaired 
by Ian Hargreaves, an intellectual-property 
specialist at Cardiff University, UK, who rec- 
ommended the changes to UK law, to examine 
the economic impact of text- and data-mining 
for scientific research and barriers to its use. 
The panel will reach conclusions by the end 
of February. 

“Our plan is just to wait for the copyright 
exemption to come into law in the United Kingdom
so we can do our own content-mining
our own way, on our own platform, with our 
own tools,” says Mounce. “Our project plans 
to mine Elsevier’s content, but we neither 
want nor need the restricted service they are 
announcing here.”

The Domaine de Vassal vine collection near Montpellier holds 2,300 different grape varieties.


VITICULTURE 


Grapevine gene bank under threat


Scientists raise concerns about relocation of premier French 
research vineyard dubbed the ‘Louvre of vines’. 


BY DECLAN BUTLER 


Uncertainty hangs over one of the
world’s largest and most important
grapevine collections. The Domaine
de Vassal vineyard, on France's Mediterranean
coast, houses a vast sweep of grape biodiversity
that is essential to research and winegrowers in
France and around the world.

The 138-year-old collection, managed by 
the French National Institute for Agricultural 
Research (INRA), has been threatened with 
eviction, prompting a decision to relocate it. 

That is raising concerns among scientists 
and winegrowers, because money to pay for 
the prospective move — costing an estimated 
€4 million (US$5.4 million) — has yet to be 
found. Even then, the sheer logistical complex- 
ity is such that relocation is likely to take years 
to complete, says INRA, and means that much 
of its research may be put on hold. 

Dubbed the ‘Louvre of grapevines’ by the 
local press, the vineyard near Marseillan, 
southwest of Montpellier, contains thousands 
of unique grape varieties. As well as having a 
conservation role in preserving genetic diver- 
sity, the collection is used for research and 
for breeding qualities such as flavour, colour, 
adaptation to specific regions and pathogen 
resistance. Several hundred samples from the
Domaine de Vassal are used annually, mainly
by other French labs, but also internationally.

“The collection is of utmost value to the
international grapevine genetics community,”
says Carole Meredith, an emeritus geneticist at
the University of California, Davis. “Although
many countries have established collections of
their own heritage grape varieties, the Vassal
collection is among the oldest and best curated.”

Meredith notes that much of her own 
research would have been “impossible” without 
this “living library”. Her lab’s previous studies of 
the vineyard’s specimens revealed Chardonnay’s 
somewhat undistinguished heritage: one of its
parent varieties is a noble Pinot, but the other 
is a Gouais, a grape long shunned as mediocre 
(J. Bowers et al. Science 285, 1562-1565; 1999). 

The collection was started in 1876 by French 
researchers in response to a pest outbreak that 
saw the near-destruction of Europe's vineyards. 
The outbreak was caused by accidental intro- 
duction of phylloxera — an aphid that infests 
roots and kills the vine. 

The vineyard was initially located near Mont- 
pellier, but moved to the Domaine de Vassal in 
1949, where it expanded greatly. It now houses 
some 7,500 accessions from 47 countries, repre- 
senting 2,300 different grape varieties, including 
wild species, rootstocks, hybrids and mutants. 

But negotiations with the vineyard’s
landowner, wine company Domaines Listel in
Sète, near Montpellier, have broken down over
the renewal of the 30-year lease on the 27-hectare
site. In 2011, Domaines Listel issued an
eviction notice; in 2012, INRA took the dispute 
to the agricultural land tribunal in Béziers, 
which is scheduled to hear the case this June. 

Yves Barsalou, president of Domaines Listel, 
says that the company remains “open to all dis- 
cussions” to find a solution that allows INRA 
to remain at the nearshore site. 

In December, INRA announced its inten- 
tion to relocate the collection, probably to a site 
alongside Pech Rouge, an INRA viticulture and 
oenology research station in Gruissan, about 
70 kilometres southwest of Domaine de Vassal. 

Olivier Le Gall, INRA’s deputy director-general
in charge of scientific affairs, says that the
agency is “extremely committed” to preserving 
the collection, and is likely to have to find most 
of the moving costs itself. Other possible fund- 
ing sources, he says, may include the French 
Vine and Wine Institute in Le Grau-du-Roi, which
does applied viticulture and wine research. 

The relocation, which should get under 
way this year, will be technically complex, says 
Jean-Michel Boursiquot, a vine taxonomist at 
Domaine de Vassal. Many specimens were col- 
lected as urgent rescue cases and carry diseases, 
but are protected from full-blown infections at 
the Domaine de Vassal because they are grown 
in beach sand. The sand shields against root 
infestations of phylloxera and nematode worms 
that can spread devastating viral vine diseases. 

INRA has decided against a similar nearshore 
location for the vineyard, fearing that rising sea 
levels caused by climate change would make the 
site vulnerable to high salinity and flooding, says 
Boursiquot. At Pech Rouge, the plants will grow 
on higher ground in limestone soils. This will 
leave diseased plants susceptible to root infesta- 
tions, so INRA intends to render the collection 
disease-free, a laborious process that involves 
repeated culturing and then propagating each 
plant until it is without pathogens. “It’s an enormous
job, which to our knowledge has never
been done on such a scale,” says Boursiquot. He
thinks that this cleaning process — equating to 
half the move costs — will take 5-10 years. 

Mark Thomas, a grapevine researcher at 
the Commonwealth Scientific and Industrial 
Research Organisation’s Waite campus in 
Urrbrae, Australia, says that the Domaine de 
Vassal is one of the few grapevine germplasm 
collections to have been extensively character- 
ized genetically, using DNA fingerprinting. This 
makes it an international reference source, and 
allows researchers to explore the genetic rela- 
tionships between varieties, and their origins. 

“This foundation of information is of great 
use for those around the world seeking to 
breed improved grape varieties,” adds Bruce 
Reisch, who develops such new strains at Cor- 
nell University’s research station in Geneva, 
New York. “It’s extremely important that this 
collection be preserved well into the future.”

Phosphorene excites materials scientists


Physicists look past graphene for atom-thick layers that
could be switches in circuits.


BY EUGENIE SAMUEL REICH 


Graphene, a one-atom-thick layer of
carbon, has charmed materials scientists
with its enticing electrical
properties that allow electrons to flow freely 
across its surface. But the material lacks a natu- 
ral band gap — a range of energy states in which 
electrons cannot exist freely — that could be 
used to switch this flow on and off. This reduces 
graphene’s usefulness as a replacement for the 
semiconductor switches in computer circuits. 

Last month, research groups in the United 
States and China reported (refs 1 and 2) on work towards
a promising candidate that could fulfil both 
needs: phosphorene, an atom-thick layer of 
the element phosphorus that does have a natu- 
ral band gap. The work is part of a trend that 
David Tománek, a condensed-matter theorist
at Michigan State University in East Lan- 
sing, dubs the “post- 
graphene age” — in 
which researchers are 
exploring alternatives 
in the hope of over- 
coming graphene’s deficiencies. The rationale 
is that phosphorene might be useful for mak- 
ing thin, flexible electronics that could be more 
easily cooled than silicon ones. 

Physicists have been studying black phospho- 
rus — a layered material held together by weak 
chemical bonds — since the 1960s. But it was 
only last year that they began trying to isolate 
single layers. Just as in graphene, phosphorene 
atoms are arranged hexagonally, but in phos- 
phorene the surface is slightly puckered. With 
its band gap, phosphorene can be switched 
between insulating and conducting states, and 
it is still flat enough to confine electrons so that 
charge flows quickly, leading to a relatively high
mobility that is prized by electrical engineers. 

Two groups, one (ref. 1) led by Peide Ye of Purdue
University in West Lafayette, Indiana, and the
other (ref. 2) by Yuanbo Zhang of Fudan University
in Shanghai and Xian Hui Chen of the Uni- 
versity of Science and Technology of China in 
Hefei, posted reports on a preprint server in 
January. They reported that they had stripped 
black phosphorus to two or three atomic lay- 
ers by using sticky tape to peel the layers off a 
larger sample — the same method used in 2004 
to isolate layers of graphene. Neither team has
yet isolated a single layer of phosphorene.

There are reasons for optimism, however. 
Already, the groups have reported charge flows 
at speeds comparable with those in single layers 
of molybdenum disulphide, a semiconductor 
material with a band gap that has been tinkered 
with for nearly two decades. And phosphorene, 
unlike molybdenum disulphide, is made from 
a single element, so pure samples are, in theory, 
easier to obtain. 

Phosphorene shares this purity with other 
post-graphene contenders such as silicene, 
made from silicon, and germanene, made from 
germanium. Although both of these are pre- 
dicted to facilitate speedier charge flows than 
phosphorene, neither has a natural band gap. 
Both needs could be met by yet another material:
stanene, a single layer of tin predicted by
theorists (ref. 3) in 2013 that has not yet been created.

A problem for all of these materials is their 
instability, because single layers can react 
with the air. Silicene, a favourite among post- 
graphene researchers, was stabilized in 2012, 
but it is still hard to prevent electrical interfer- 
ence from the metallic substrates it has to be 
grown on, says Patrick Vogt, a physicist at the 
Technical University of Berlin — so the applica- 
tions imagined for silicene are a way off. “There 
is quite a lot of hype in this area,” he says. 

Phosphorene seems more stable than its 
competitors, but it is not easy to produce: 
making black phosphorus entails putting the 
raw, powdered element under extreme pres- 
sure. Phaedon Avouris, a chemical physicist 
at IBM’s Thomas J. Watson Research Center 
in Yorktown Heights, New York, says that the 
latest results justify more study, but he suspects 
that phosphorene’s success in electronics will 
depend on whether researchers can find effi- 
cient ways to extract single layers and deposit 
them on substrates. 

Sébastien Francoeur, a physicist at the Poly- 
technic Institute of Montreal in Canada, has 
already been seduced. He began working with 
black phosphorus after seeing the latest results. 
“A two-dimensional material that is a semiconductor
is interesting technologically,” he says.


1. Liu, H. et al. Preprint at http://arxiv.org/abs/1401.4133 (2014).
2. Li, L. et al. Preprint at http://arxiv.org/abs/1401.4117 (2014).
3. Xu, Y. Phys. Rev. Lett. 111, 136804 (2013).

WINTER SPORTS FACE
AN UNCERTAIN FUTURE 
AS THE PLANET WARMS. 


BY LAUREN MORELLO 


Skiers, snowboarders and other
athletes got a bit of a shock when
they arrived in Sochi, Russia, for the
22nd Olympic Winter Games. On
the way into town from the airport,
competitors passed rows of palm trees that
thrive in the breezes blowing off the Black
Sea. Forty kilometres away, on the ski slopes
of Rosa Khutor, Sochi organizers have spent
a year stockpiling manufactured snow as a
hedge against the region's mild climate.

Meteorologists may scoff at the decision
to hold a winter sporting event in a city
where February averages a balmy 6 °C and
temperatures hover just above freezing
in the nearby mountains. But Sochi and
its massive snow-making operation offer
a glimpse of the future of skiing and the
pressures that will confront Olympic planners
as the world heats up.

“We know things are going to get warmer,
and eventually, when you have temperatures
above freezing more commonly than not,
you’re going to see less snow,” says David
Robinson, a hydroclimatologist who runs
the Global Snow Lab at Rutgers University
in Piscataway, New Jersey. In the Northern
Hemisphere, the snow season has shrunk
by about three weeks since the early 1970s
(see ‘Shorter winters’), and snow cover is
projected to decline substantially by the end
of the century, according to a report released in
September by the Intergovernmental Panel on
Climate Change.

SHORTER WINTERS

Snow extent in the Northern Hemisphere has
increased slightly in autumn and winter over
recent decades, but has dropped substantially in
spring, resulting in a shorter snow season.

THE DWINDLING OLYMPICS

If greenhouse-gas pollution increases slowly,
climate forecasts suggest, only 10 of the 19
previous Winter Olympic sites will have a high
probability of having enough snow and low
[…] in the 2080s. Projections suggest that if
emissions climb quickly, only six former sites
will be suitable.

ROCKY SLOPES

Colorado's Aspen resort is one of the most
famous skiing areas in North America, but its
long-term future is uncertain. Projections in one
study suggest that by 2030 the rising winter
snowline will near the base of the ski lifts. By
2100, the winter snowpack is forecast to cover
only the top of the mountain.

TROUBLE IN THE ALPS

If emissions of greenhouse gases continue to
increase quickly, climate simulations project
that many Swiss resorts will see a sharp
reduction in the number of days with enough
natural snow for skiing.

SNOW DOWN UNDER

Forecasts using a mid-range emissions
scenario suggest that high-elevation skiing
areas in New Zealand could see deeper snow
in the 2040s. But by the 2090s, the number
of snow days and the depth of the snowpack
are projected to fall substantially.

Map panels: current modelled snow; average of
models for 2040s; average of models for 2090s.

SOURCES: SHORTER WINTERS: RUTGERS UNIV. GLOBAL SNOW LAB;
THE DWINDLING OLYMPICS: D. SCOTT ET AL. THE FUTURE OF THE
WINTER OLYMPICS IN A WARMING WORLD (UNIV. WATERLOO, 2014);
TROUBLE IN THE ALPS: M. BENISTON WIRES CLIM. CHANGE 3, 349-358
(2012); ROCKY SLOPES: B. LAZAR & M. WILLIAMS PROC. WHISTLER

Snow has been stockpiled for the Olympic Winter Games in Sochi, Russia, for a year.

There is considerable uncertainty in regional 
forecasts for individual skiing regions, but 
climate researchers say that models agree 
on broad patterns. “By the second half of the 
twenty-first century, the models suggest we'll 
be seeing really big changes in temperature 
and precipitation,” says Martin Beniston, a cli- 
mate physicist at the University of Geneva in 
Switzerland. Already, signs of an unwelcome 
thaw have appeared at even the highest eleva- 
tions. This season, the Verbier 4 Vallées resort 
in Switzerland eliminated two chair lifts after 
the lower edge of Tortin Glacier, at 2,800 metres 
elevation, receded by 40 metres in just 15 years. 

The outlook is not so gloomy everywhere, 
at least initially (see ‘Snow down under’). A 
warmer atmosphere can hold more moisture, 
so rising temperatures may actually increase 
snow at some high-elevation sites — such as 
the peaks of New Zealand and parts of the 
Swiss Alps — for several decades, until winter 
temperatures inch above freezing (see ‘Trouble
in the Alps’). In fact, average snowfall over the 
past decade at the Verbier resort has outpaced 
that in each of the previous three decades. But 
the area has also had to deal with more vari- 
ability, warmer summers and a spate of hit-or- 
miss winters. “The extremes are much higher 
or much deeper,” says Eric Balet, chief executive
of Téléverbier, the company that owns
4 Vallées. That introduces an unwelcome ele- 
ment of unpredictability for resort managers. 

Skiing areas at low elevations face the 
worst forecasts. The US states of Connecticut
and Massachusetts are home to a combined
17 skiing areas, and a study suggests that by 
2039, none will sustain a viable skiing season 
— defined by industry as 100 days or more — 
even with artificial snow-making (J. Dawson 
and D. Scott Tourism Mgmt 35, 244-254; 2013). 
But at least 94% of the 18 resorts in the more 
northerly state of Vermont are projected to be 
viable until 2070 or beyond. A big difference
is altitude: all the resorts in Connecticut and
Massachusetts have a peak elevation below 
750 metres, whereas 16 of those in Vermont 
exceed that, many by hundreds of metres. 


QUALITY AND QUANTITY 

The prospect of losing small skiing areas 
worries Auden Schendler, vice-president for 
sustainability at Aspen Skiing Company, which 
runs the Aspen and Snowmass resorts high in 
central Colorado (see ‘Rocky slopes’). “We call 
those feeder resorts,” he says, because their 
lower prices and gentler slopes attract new skiers
to the sport. “It doesn’t serve us for other
resorts to go out of business.” And small moun- 
tains can produce big stars. The 2010 Olympic 
downhill champion, Lindsey Vonn, started 
skiing at Buck Hill, a 364-metre mountain in 
Burnsville, Minnesota; reigning World Cup 
slalom champion Mikaela Shiffrin trained as a
child at New Hampshire's Storrs Hill Ski Area,
where the peak elevation is just 177 metres. 

It is not only the quantity of snow that is cru- 
cial for winter sports such as skiing and snow- 
boarding. Quality matters too, says Anne Nolin, 
a snow hydrologist at Oregon State University 
in Corvallis. A warmer, moister atmosphere 
will produce heavier, wetter snow, not the dry, 
fluffy ‘champagne powder’ prized by many rec- 
reational skiers. Artificial snow created with 
snow-making cannons is often icy, perfect for 
laying the base of lightning-fast competition 
runs but less favourable for the average skier. 
And temperatures that skirt the freezing mark 
increase the risk that precipitation will fall as 
rain, not snow, and will raise the density of the 
snowpack. “These just aren't the kind of conditions
that people go skiing for,” says Nolin.

It is not clear what the changing face of winter 
portends for future Olympic Games (see ‘The
dwindling Olympics’). The competition has 
already been shaped by the vagaries of weather 
and climate, beginning with the decision dec- 
ades ago to move figure skating, speed skating, 
ice hockey and curling indoors, says Daniel 
Scott, a geographer at the University of Waterloo 
in Canada. His research suggests that the pool 
of locations capable of hosting the games will 
shrink as the climate warms — and the colder 
mountain cities that may be the best fit may 
not have the infrastructure to handle a massive 
influx of athletes, spectators and organizers. 
That will force some difficult decisions, he says. 
“It’s an interesting dilemma the International
Olympic Committee will be caught in.”


Lauren Morello is the assistant US News 
editor at Nature.

THE CHANGING FACE OF
PRIMATE RESEARCH 


A hard-won political victory for primate research 
is at risk of unravelling in pockets of Europe. 


BY ALISON ABBOTT 


The worst moment in neuroscientist
Andreas Kreiter’s 16-year struggle to
defend his research came when his wife
arrived home after the birth of their second
child. Waiting for her was an envelope
containing a death threat against
their three-year-old.

Kreiter, who uses macaques in his studies of 
the brain at the University of Bremen in Ger- 
many, is a veteran of the fierce and periodically 
violent tactics of animal-rights activists. When 
protests peaked in the late 1990s, he lived 
under police protection — but he still contin- 
ued his research. “I had thought very carefully 
before deciding to work with primates,” he
says. “And I believe it is necessary if we are to
understand the human brain.”

Later Kreiter found himself facing an unfa- 
miliar foe: local authorities looking to restrict 
primate research in their city. In 2008, Bremen 
officials declined to renew Kreiter’s licence to
work with macaques. The fate of his research
has been in legal limbo ever since. 

Kreiter’s courtroom conflicts put him in 
good company. Across Europe, a particularly 
volatile patchwork of emerging local regula- 
tions threatens to distort the spirit of a recent 
European Union (EU) directive that explic- 
itly allows research on non-human primates. 
Although some researchers say they have never 
felt so secure, others are facing new obstacles as 
activists change tack, from bullying researchers 
to putting pressure on regional policy-makers. 

The problems continue even as the EU is 
pushing for the translation of basic research 
into therapies — a transition that often 
requires the testing of experimental therapies 
in primates. And opportunities for translational 
research are growing thanks to recent techno- 
logical breakthroughs. However, restrictions 
on primate experiments could hinder their 
development. 


ILLUSTRATION BY GARY NEILL 


Some European researchers are shifting 
their strategies, too, by talking more openly 
about their work with primates. But other 
scientists have simply stopped using monkeys 
altogether — or side-stepped the European 
quagmire by setting up controversial collabo- 
rations in other countries, particularly in Asia. 

“Primate researchers should always expect 
to be under pressure, because we are handling 
a valuable and sensitive resource,” says Roger 
Lemon of University College London, UK, 
who hopes his work on how the brain controls 
fine hand movements might lead to therapies 
for recovering function after a stroke. “But it's a 
sad irony that key developments may be trans- 
ferring to countries that don’t have the high 
level of animal welfare we have here.” 


STABILIZING STEP 

The pressures on primate researchers have 
taken many forms. In the United States, for 
example, commercial airlines have effectively 
ceased all primate shipments by air within the 
country, making it difficult for researchers to 
transport animals. Many airlines in Europe 
have taken similar steps, but Air France con- 
tinues to provide service. 

Not long ago, the EU seemed to take a step 
towards stabilizing the environment for pri- 
mate research. In September 2010, after more 
than a decade of anguished public debate, the 
EU adopted a directive governing the use of 
animals for research purposes. With its careful 
balance of animal-welfare and research needs, 
the directive seemed destined to ease tensions. 
Among other things, it established minimum 
welfare requirements for all animals, laid out 
definitions of pain intensity, and banned most 
research on great apes. It also included a hard- 
won clause — added at the last minute after 
intense lobbying by the biomedical commu- 
nity — explicitly permitting basic research 
on non-human primates, provided the work 
could not be carried out in any other species. 

EU member states were required to anchor 
the directive into national legislation by 1 Janu- 
ary 2013. And they were forbidden to ‘gold- 
plate’ the regulation by making national law 
stricter than EU law. 

But animal-rights activists have continued 
their fight. They have honed their activities 
for greater media attention and have delayed 
implementation of the directive in several 
countries. Animal-rights organizations now 
focus on policy-makers rather than scientists, 
says Robert Molenaar, campaign manager for 
the Coalition Against Animal Experiments 
(ADC), which operates in the Netherlands 
and Belgium. The ADC is concentrating first 
on monkey research in universities, he says, 
because it is an easy way to get press coverage 
and influence political opinion. 

The ADC is also forging international links 
and works closely with a sister organization 
in the United Kingdom, the Anti Vivisection
Coalition (AVC), headed by Luke Steele.
Steele spent nine months in prison after being
convicted in 2012 of harassing staff at Harlan 
Laboratories, a contract research company in 
Blackthorn, UK. The jail time was interest- 
ing, he says: he used it to reflect on strategies. 
“Researchers themselves tend to be traditional- 
ists who are not open to alternatives,” he says. 
“I realised we need to go for policy-makers.”

“YOU CAN'T GO DIRECTLY FROM MICE TO HUMANS. MICE ARE SIMPLY NOT A GOOD MODEL OF HOW PEOPLE SEE.”

The AVC and the ADC were the main drivers
of the Stop Vivisection Initiative, a petition
calling for the EU animal-research directive to 
be abrogated and animal research to be banned 
altogether. The petition, launched in November 
2012, collected more than a million signatures 
across the EU within a year. The signatures are 
now being verified; if they pass, the initiative 
will be granted hearings at the European Com- 
mission and the European Parliament. 

“This will reopen the debate — something 
we'd all rather do without, given the enormous 
effort that the commission, scientists and 
animal-welfare groups invested in achieving 
the compromise,” says Stefan Treue, director
of the German Primate Center in Göttingen
and an adviser to the European Commission 
on the 2010 directive. 

Treue doubts that the Stop Vivisection 
campaign will change European legislation 
— political demand for new therapies is too 
strong, he says. But, like many of his col- 
leagues, he says that researchers working with 
monkeys should abandon their conventional 
tactic of keeping quiet, which cedes ground 
to the activists. Two months after the direc- 
tive was approved, Treue helped to launch the 
Basel Declaration (see Nature 468, 742; 2010), 
which commits its signatories — so far more 
than 2,500 — to be open about their animal 
research and to engage in public dialogue. 

The declaration prompted a sea change, 
and many initiatives are emerging in its wake. 
For example, the Swiss Primate Competence 
Center for Research was launched last year in 
Fribourg to provide a training centre for sci- 
entists and technicians wanting to work with 
primates, and an educational one-stop shop for 
the public. 

Individual scientists are also speaking up on 
their websites. Neuroscientist Pieter Roelfsema 
at the Netherlands Institute for Neuroscience 
in Amsterdam, who works with monkeys, says 
that so far activists have not targeted researchers
in his lab. But he fears this may soon change.
Last spring, minority parties in the Dutch
parliament — including the Dutch Party for 
the Animals — posed formal questions about 
whether research using monkeys is necessary, if 
it could be replaced by alternative methods, and 
if the number of government-funded research 
institutes using monkeys could be reduced. 

With these developments in mind, Roelf- 
sema is planning a public-information webpage 
about the value of primate research, modelled 
on that of Nikos Logothetis, a director at the 
Max Planck Institute for Biological Cybernetics 
in Tübingen, Germany. Logothetis’s site, which
has thousands of visitors a week, emerged from 
a public-relations debacle. In 2009, he invited a 
team of investigative journalists from a national 
television company into his lab, imagining 
that the reporters would be impressed by his 
monkeys’ luxurious accommodation, and sur- 
prised by how relaxed and content the animals 
seemed. Instead, the journalists portrayed a 
slightly mad scientist among suffering animals. 
The experience “spectacularly demonstrated 
the need for a reaction of scientific organizations
to the escalating absurdity of the anti-vivisectionists”,
Logothetis says.

However, Tübingen, unlike Kreiter’s
Bremen, is a city where researchers enjoy a
supportive political environment. Even the city’s 
mayor, a member of the Green Party, which is 
not known for supporting animal experiments, 
has openly criticized flyers distributed by activ- 
ists as untruthful, and described the harsh treat- 
ment of Logothetis as “unacceptable”. 

“This shows the power of local politics to 
influence how easy or difficult it can be to 
carry out research using monkeys in different 
European regions,” says Treue, whose research
centre also benefits from local political support
in Göttingen. For scientists such as Treue, the
EU directive has brought a feeling of stability. 


THE ITALIAN JOB 

That feeling is largely absent in Italy. In 2012, 
activists attacked a beagle-breeding facility 
near Brescia. It was later closed down. In 2013, 
they sabotaged experiments at the University 
of Milan. And last month, activists posted fly- 
ers that included photographs, addresses and 
phone numbers of some of the university's 
researchers in their home neighbourhoods. 

By 2012, some populist politicians had 
adopted the animal-rights cause and used it 
to influence the Italian implementation of the 
EU directive. The proposed law went beyond 
the directive, calling for a ban on xenotrans- 
plantation and the use of animals in addiction 
research. 

Italian scientists woke up late to the threat, 
and by the time researchers had organized a 
petition defending animal research — signed by 
13,000 people in just a few weeks — the course 
of the distinctly gold-plated law was already set. 
It passed through parliament in December. 

Animal-rights campaigners have switched from targeting scientists to exerting pressure on policy-makers.

Researchers who use monkeys are also worried
about ambiguities in how the Italian law
interprets the EU directive's clause allowing
research on non-human primates. “It’s not 
clear at all whether basic research is allowed or 
not,” says neurophysiologist Roberto Caminiti 
at the University of Rome La Sapienza, who 
chairs the Committee on Animals in Research 
for the Federation of European Neuroscience 
Societies. 

The law also requires all research proposals 
involving non-human primates, cats or dogs 
to be authorized by the High Health Council 
(Consiglio Superiore di Sanità), the broad
mandate of which includes drug licensing and 
approval of clinical protocols. This additional 
level of control, on top of the approval required 
from local ethical committees, would slow and 
destabilize the process, says Caminiti. 

The legislation is expected to become law 
in March. As soon as it does, Caminiti and 
his colleagues plan to file an appeal to the EU 
Court of Justice. “Gold-plating is not allowed,”
he says, “so we are confident of winning.” In the
meantime, Caminiti predicts that Italian labs 
working with primates will all be able to argue 
that their work has health benefits for humans. 

In Belgium, the government is hurrying 
through a similar gold-plating decree that 
would also ban the use of primates in addic- 
tion studies, and require a national commit- 
tee to approve projects involving non-human 
primates, even after approval by local ethics 
committees. The Belgian health minister 
would have the final say on whether a particu- 
lar project could go ahead, raising concerns 
that final decisions would be based on politics, 
rather than on science or ethics. 

Political decisions are already affecting
research in Switzerland, a non-EU country
that is not bound by the 2010 animal-rights 
directive. In 2000, Switzerland’s constitution 
was changed to protect the dignity of animals 
— a move that led courts to limit the use of
monkeys to translational research. 

Researchers in Fribourg have been able to 
continue their studies of spinal-cord repair in 
primates, but local authorities in Zurich have 
not renewed licences for basic research using 
primates since 2004. Kevan Martin, a direc- 
tor at the city’s Institute of Neuroinformatics, 
had to stop mapping the functional microcir- 
cuitry of the macaque brain in 2006, when his 
licence expired. Martin was shocked to learn 
that local authorities had declined to renew his 
licence because the work was unlikely to reap 
practical benefits for society in the near term. 
He was even more shocked when his appeal to 
Switzerland’s supreme court was turned down. 
“Is any applied research possible without basic 
research?” he muses. 


WORKING ABROAD 

In this climate, some Swiss scientists are 
relying on their collaborations in other coun- 
tries to carry out primate experiments. Botond 
Roska of the Friedrich Miescher Institute for 
Biomedical Research in Basel and his colleagues 
have used mice to develop an experimental 
treatment for a common type of blindness 
called retinitis pigmentosa. The method is now 
poised for human trials, to be run by the small 
Paris-based biomedical company GenSight 
Biologics, which Roska co-founded. “But you
can’t go directly from mice to humans because
you can’t be sure if the neural circuits are the
same,” says Roska. “Mice are simply not a good
model of how people see.” 

Rather than face uncertainty in Switzerland, 
Roska and his collaborators — GenSight and 
the Vision Institute in Paris — are conducting 
primate studies in France, where animal activ- 
ists have less political support. Roska hopes 
the first human patient could be treated within 
the year. 

Like Roska, Per-Olof Berggren at the 
Karolinska Institute in Stockholm has reached
a translational turning point in his research. 
He has developed an experimental therapy for 
diabetes in mice, and now needs to test it in pri- 
mates before moving to humans. He thinks he 
could have got a licence for this in Sweden, but 
knew that he could not have afforded it. Regu- 
lations in the country, where animal-rights 
and animal-welfare groups are very powerful, 
require particularly large, sophisticated — and 
consequently expensive — primate facilities. So 
Berggren decided to do the work in Singapore, 
where he says facilities are first-class and ethical 
standards are as high as in Europe. “They have a
long tradition of working with monkeys there, 
and it doesn’t cost so very much.” 

Berggren is far from alone: many European 
researchers are taking their primate research 
to Asia, sparking a controversy that is divid- 
ing the scientific community. Some worry that 
standards of ethical oversight and animal wel- 
fare could be lower in certain Asian countries. 
And Martin points out that the trend exacer- 
bates the loss of skills already apparent as the 
number of groups working on primates in 
Europe falls. (The number of primates used in 
the EU for scientific purposes shrank by more 
than 25% between 2008 and 2011, according to 
the European Commission.) “The loss is going 
to be much harder to reverse,” he says. “Finding 
anaesthetists and surgeons has already become 
more difficult.”

One European scientist, recently returned 
from two weeks at a leading institute in China, 
says that he found many Europeans setting up 
collaborations there — but they, like him, did 
not want to say so openly, for fear of damaging 
the reputations of their home institutions. 

The scientist insists that ethical concerns are 
out of place, and that standards at the insti- 
tutes match those of Germany and the United 
States. “It is not a question of low standards but
of forward-looking research,” he says. “And it
is nice to enjoy the energy and optimism, and
not always hear the word ‘no’.”

Back in Bremen, Kreiter still hopes to hear 
a ‘yes’ in court. With the support — moral and
financial — of his university, he has spent more 
than five years fighting local authorities in a 
string of courtroom battles. He is now awaiting 
yet another verdict from a high court in Leip- 
zig. “It may be the last,” he says. “But you never 
know how things will develop.” ■ SEE EDITORIAL P.5


Alison Abbott is Nature’s senior European
correspondent. 
Make supply chains
climate-smart 


Society’s infrastructure is hit hard by extreme weather. Networks of trade, transport 
and production need to adapt globally, says Anders Levermann. 


Extreme weather — including massive
storms such as Typhoon Haiyan and
Hurricane Sandy, and severe floods
and droughts — is likely to become more
frequent and intense as global warming
accelerates.
Links in global economic chains and 
world markets mean that extreme weather in 
one place can have repercussions elsewhere. 


For example, a combination of exceptional 
rainfall and Cyclone Yasi in 2010-11 para- 
lysed the world’s fourth-largest region of 
coal exploration in Queensland, Australia. 
Coking coal prices rose the following year by 
25%. In 2011, droughts and floods in Russia, 
Pakistan and Australia caused global food 
prices to climb, possibly contributing to 
the escalation of civil unrest in Egypt, Syria 
and Saudi Arabia. The long-term economic
impacts of Typhoon Haiyan, which devas- 
tated the Philippines in November 2013, are 
yet to be felt, but are likely to affect global 
trade and manufacturing. 

Yet the impacts of adverse weather on sup- 
ply chains are missing from the assessments 
of the Intergovernmental Panel on Climate 
Change, and, with a few exceptions,
are being ignored in discussions around
adaptation. This is a mistake. Adaptation 
requires a global strategy, not just local ones. 

It is these unanticipated and sudden shocks
from extreme weather events on global trade 
that are most disruptive for society; gradual 
changes can be foreseen and are easier to 
adapt to. Sitting in a bathtub with the tap 
running it is easy to stop the floor getting wet 
as the water rises by placing a few towels (up 
to a point). But the effect of climate change 
is like throwing rocks into the water. Our 
interlinked societies, the dynamics of which 
we are only beginning to understand, are like 
dominos lined up on the edge of the tub. One 
wave can make them all tumble. 

As protests in Brazil, Turkey and Greece 
in recent years have shown, societies do not 
have to be brought to the verge of starva- 
tion to descend into turmoil. Communities 
respond to events in unforeseen ways. The 
triggers might be small and the reactions 
cannot be understood with equilibrium 
economic theory. With the fragility of our
globalized economy becoming more evi- 
dent, it is time to readjust our focus. 


GLOBAL ADAPTIVE PRESSURES 


The influence of climate change on the 
worldwide flows of materials, electricity, 
communications and energy, including 
interactions between them and the rapid 
dynamics of volatile markets, needs to be 
modelled and understood. As a first step, we
must collect and share basic data on global 
supply chains. 

To this end, my colleagues and I at the 
Potsdam Institute for Climate Impact 
Research in Germany have set up a website 
called Zeean (www.zeean.net) to host such 
information and to help to kick-start a com- 
munity effort to understand and model it. By 
making these economic data available, gov- 
ernments and companies will be alerted to 
crucial bottlenecks in supply chains and can 
respond accordingly. Factoring in the costs of 
climate change will, we anticipate, allow mar- 
ket forces to help to stabilize the global supply 
network and make societies more resilient. 


SUDDEN SHOCKS 

Almost all adaptation research so far focuses
on local or regional responses to gradual
changes in climate. This perspective has
steered climate-change concerns towards
long-term threats to fragile ecosystems and
poor rural communities, which could be
lessened through targeted strategies, such
as altering crop production in the face of
shifting monsoons in India, for example.
Meanwhile, global trade connections and
dramatic weather events are colliding —
with even more costly consequences.

Figure ‘Global adaptive pressures’: Simple modelling of supply chains shows how the cessation of exports from one
country, for example the Philippines in the wake of a typhoon such as Haiyan, will
affect many others. Direct trade links are broken immediately and may cause
shortages (top panel). Supply restrictions from those nations spread further,
affecting the global economy (bottom panel).
DIRECT IMPACT (top panel): 6% of US production relies on supplies from the Philippines.
Bottom panel: 21% of US production could suffer indirectly from supply-chain problems.
Axis: Percentage of domestic production (2011 values) affected.
Photo caption: Philippine exports such as coconut oil have fallen after Typhoon Haiyan.

28 | NATURE | VOL 506 | 6 FEBRUARY 2014

© 2014 Macmillan Publishers Limited. All rights reserved

Thailand’s devastating 2011 floods 
destroyed the country’s automobile indus- 
try. And by disrupting manufacturing of 
mainly Japanese technology companies 
based in Thailand, the floods also caused 
a global shortage of hard-disk drives. With 
more than US$46 billion in damage, the
floods were ranked as the fourth-costliest 
disaster ever by the World Bank. But that 
does not include secondary losses from 
missing means of production. 

Disruption to pharmaceutical supply 
networks is already having deadly conse- 
quences. The increasingly complex supply 
chains for drugs are highly susceptible to 
blockages, causing shortages of medicines. 
In 2011, global supplies of cancer and AIDS 
medications ran short after a failed hygiene 
inspection of one US manufacturer (see 
go.nature.com/qhgfg5). Extreme weather 
events compound those risks. “Resting on 
old standards — even ones that have worked 
for decades — is no longer enough,” cau- 
tioned Robert Parkinson, president of the US 
pharmaceutical company Baxter, addressing 
a US congressional subcommittee in 2008. 

Bouts of severe weather in quick suc- 
cession are even harder to recover from. 
Pakistan, for example, is still suffering from 
devastating monsoon-induced floods in 
2010 and 2011. If hurricanes Sandy and 
Katrina had hit the US seaboard in the same 
season as last year’s drought, even the United 
States might have struggled to cope. 


KNOCK-ON EFFECTS 

A simple calculation of economic flow 
disruption illustrates how the global reper- 
cussions of weather disruptions to supply 
chains are likely to eventually exceed the 
direct damages. On the basis of data col- 
lected by Manfred Lenzen, a professor of 
sustainability research at the University of 
Sydney, Australia, and his colleagues, we
estimate that if not replaced, the cessation 
of exports from the Philippines, for exam- 
ple from fisheries and agriculture, would 
affect 6% of US production directly (see 
‘Global adaptive pressures’). The potential 
secondary effect, mainly through the retail 
trade, would be larger and could affect 21% 
of US production. The Philippines is
the world’s largest exporter of coconut
oil, which is used in food products worldwide.
For major economies such as Japan,
Spain, the United Kingdom and the United
States, mounting impacts on sectors such
as food production might greatly exceed
the direct ones.
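
The distinction between direct and secondary effects can be made concrete with a toy network. The sketch below is only an illustration under stated assumptions: the sector names, trade flows and production figures are all invented, and the one-step propagation rule is a deliberate simplification, not the calculation behind the estimates quoted above.

```python
# Toy illustration only: every name and number here is invented,
# and this is not the calculation used for the article's estimates.

# (supplier, buyer) -> value of goods traded, arbitrary units
FLOWS = {
    ("PH:agriculture", "US:food"): 2.0,
    ("PH:agriculture", "JP:food"): 1.0,
    ("JP:food", "US:retail"): 3.0,
    ("US:food", "US:retail"): 5.0,
    ("DE:steel", "US:cars"): 4.0,
}

# sector -> total production, arbitrary units
OUTPUT = {
    "US:food": 10.0,
    "US:retail": 30.0,
    "US:cars": 60.0,
    "JP:food": 20.0,
}


def buyers_of(suppliers, flows):
    """Return every sector that buys from at least one of the suppliers."""
    return {buyer for (supplier, buyer) in flows if supplier in suppliers}


def affected_shares(country, shocked_region, flows, output):
    """Share of a country's production hit directly and one step downstream."""
    # Sectors in the shocked region that export anything at all.
    shocked = {s for (s, _) in flows if s.startswith(shocked_region + ":")}
    # Direct: sectors buying from the shocked region.
    direct = buyers_of(shocked, flows) - shocked
    # Indirect: sectors buying from a directly affected sector.
    indirect = buyers_of(direct, flows) - direct - shocked

    prefix = country + ":"
    total = sum(v for k, v in output.items() if k.startswith(prefix))
    direct_hit = sum(output[k] for k in direct if k.startswith(prefix))
    indirect_hit = sum(output[k] for k in indirect if k.startswith(prefix))
    return direct_hit / total, indirect_hit / total


direct_share, indirect_share = affected_shares("US", "PH", FLOWS, OUTPUT)
print(f"direct: {direct_share:.0%}, indirect: {indirect_share:.0%}")
```

In this invented network the indirect share is larger than the direct one because a sector with a large output (retail) sits downstream of the sectors that import from the shocked region.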

Adaptation, like mitigation, needs to 
be global, but that does not mean it must 
be coordinated or regulated. Because the 
main aim of adaptation is to avoid damage, 
market forces might be effective in trans- 
ferring costs. Just as the costs of insurance 
and meeting local safety requirements are 
factored into the prices of goods today, 
the costs of protecting supply chains from 
natural hazards can be included — if the 
information is available. 

In many cases it will turn out that a 
diverse but reliable supply net is more 
favourable than relying on one source, even 
if that transportation route seems cheaper 
at first sight. 

Lenzen’s group has pioneered the appli- 
cation of supply-chain data to climate and 
sustainability problems. They have compiled
a database covering 26 industrial sectors
in more than 180 countries, including,
for example, the quantity of Australian coal 
used by the German steel industry and how 
much German steel is used in Russian
shipbuilding and Japanese car fabrication.


COLLECT DATA 

Building on Lenzen’s data, we plan for
Zeean to eventually cover 400 sectors and
individual states, provinces and cities, allowing
users to track the flows of specific goods
at a scale appropriate for the effects of
natural disasters. Users could ask, for
example, how many batteries are shipped
from Osaka, Japan, to California. Or what is
the impact of a hurricane in Boston,
Massachusetts, or a flood in Bangalore, India,
on particular industries worldwide?

In some cases, this supply-chain infor- 
mation is publicly available, but scattered 
around. In other cases, it must be deduced 
from import and export figures provided by 
national statistical agencies or government 
and industry bodies. 

We intend the information posted to Zeean 
to be cross-checked and validated by regis- 
tered and vetted users who will be assessed 
according to the quality of their input, in a 
similar way to websites such as Wikipedia. 
Single pieces of data, such as the number of 
cars produced in a region of Germany, for 
example, may be entered, as well as whole 
data sets, such as the 400-sector trade input- 
output matrices for Australia and Japan. 

The information will need to be checked 
by comparing data from different sources or 
through consistency calculations. The sum 
of all individual flows out of a region should
not exceed its total export, for example. 
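
The consistency rule just described can be expressed in a few lines. The following is a minimal sketch, assuming flows are stored as (origin, destination) pairs and that each region has one reported export total; the function name, data layout and figures are hypothetical and do not describe how Zeean itself stores or checks data.

```python
# Hypothetical records: the layout, names and numbers are invented for illustration.

def find_export_inconsistencies(flows, total_exports, tolerance=1e-9):
    """Return regions whose summed outgoing flows exceed their reported total export.

    flows: dict mapping (origin, destination) -> flow value
    total_exports: dict mapping origin -> reported total export
    """
    # Add up every individual flow leaving each origin.
    summed = {}
    for (origin, _destination), value in flows.items():
        summed[origin] = summed.get(origin, 0.0) + value

    problems = {}
    for origin, outflow in summed.items():
        reported = total_exports.get(origin)
        if reported is None:
            continue  # no reported total to compare against
        if outflow > reported + tolerance:
            problems[origin] = (outflow, reported)
    return problems


flows = {
    ("Osaka", "California"): 40.0,
    ("Osaka", "Bavaria"): 25.0,
    ("Bavaria", "Osaka"): 30.0,
}
total_exports = {"Osaka": 60.0, "Bavaria": 50.0}

print(find_export_inconsistencies(flows, total_exports))
```

A flagged region does not say which entry is wrong, only that the individual flows and the reported total cannot both be right, so it marks where cross-checking against another source is needed.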


Submerged cars from a Honda factory after floods in Thailand in 2011. 


Unlike in earlier approaches, we will not 
harmonize the information within a single 
global matrix, but will use the existing inter- 
national data structures that comprise large 
economic sectors, and apply consistency 
tests to finer data where we can. This will 
avoid the introduction of artefacts, such as 
small unrealistic flows, to bridge data gaps 
and make computations feasible. 

The processed information will be pub- 
licly available, so that small businesses and 
poor countries can use it. Only openly 
available data will be included, to avoid 
legal and rights issues. Open-source algo- 
rithms and analysis tools will be used for 
accessibility. Funding to maintain the data- 
base is being sought. 


UNDERSTANDING RISK 

With each piece of information added, 
Zeean will improve in accuracy. Users will 
be able to see a network evolving, analyse 
its connectivity and identify fragile links or 
nodes. By accommodating a variety of data, 
which need not be homogeneous, the data- 
base will allow for regional and economic 
foci of special interest. 
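
One simple way to look for fragile links in such a network is to flag buyers that depend heavily on a single supplier. The sketch below assumes the same (supplier, buyer) flow layout as a plain dictionary; the 80% threshold, the function name and all numbers are arbitrary choices for illustration, not part of the Zeean design.

```python
# Illustrative only: threshold, names and numbers are invented.

def fragile_buyers(flows, threshold=0.8):
    """Return buyers whose largest single supplier provides at least `threshold` of inputs.

    flows: dict mapping (supplier, buyer) -> flow value
    Returns a dict mapping buyer -> (dominant supplier, its share of inputs).
    """
    # Group incoming flows by buyer.
    inputs = {}
    for (supplier, buyer), value in flows.items():
        inputs.setdefault(buyer, {})[supplier] = value

    fragile = {}
    for buyer, by_supplier in inputs.items():
        total = sum(by_supplier.values())
        if total <= 0:
            continue  # no usable input data for this buyer
        top_supplier = max(by_supplier, key=by_supplier.get)
        share = by_supplier[top_supplier] / total
        if share >= threshold:
            fragile[buyer] = (top_supplier, share)
    return fragile


flows = {
    ("RegionA", "disk_drives"): 9.0,
    ("RegionB", "disk_drives"): 1.0,
    ("RegionC", "steel"): 5.0,
    ("RegionD", "steel"): 5.0,
}
print(fragile_buyers(flows))
```

Here the buyer with two equal suppliers is not flagged, which reflects the point made earlier that a diverse supply net can be preferable to relying on one source.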

Longer term, to produce supply-chain risk 
assessments, these economic data should be 
combined with probability assessments of 
future climatic extremes from global and 
regional climate models, as well as models 
of smaller-scale phenomena such as hurricanes
or tornadoes. Other natural-hazard
models might be included. 

Understanding the global economic net- 
work is a huge challenge. The harmoniza- 
tion and cross-checking of the information 
will require constant effort, and some
inconsistencies between data on differ- 
ent scales will be unavoidable. At the same 
time, the more information that is added, the 
better the network will get. 

Although we will never be able to predict 
every impact, we must aim to provide the 
best information we can because society 
needs to decide what to do, even in the presence
of uncertainty. With the wrong focus,
we will protect the wrong places with the
wrong tools. ■


Anders Levermann is professor of 
dynamics of the climate system at the 
Potsdam Institute for Climate Impact 
Research, Germany; and is at the Institute of 
Physics, Potsdam University, Germany. 
e-mail: anders.levermann@pik-potsdam.de 
The human puzzle


Henry Gee relishes the memoir of Svante Pääbo, a leader in the field of ancient DNA.


At a Royal Society meeting in London
last year, just weeks before the publication
in these pages of a high-quality
Neanderthal genome (K. Prüfer
et al. Nature 505, 43-49; 2014), David Reich
— one of the paper’s authors — spoke of 
“introgression” between Neanderthals, 
Homo sapiens and other hominins. This 
irked a member of the audience. “Are you 
telling me,” he asked, in cut-glass tones, 
“that these different species copulated with 
one another?” I was seized by an impulse to 
stand up and reply, in similarly stentorian 
fashion, “Not only did they copulate, but 
their union was blessed with issue!” (I stayed
in my seat.)

Neanderthal Man: In Search of Lost Genomes

NATURE.COM
For more on Neanderthals, see: go.nature.com/do74np

The study of human origins and evolution
currently stands on a cusp. For decades
we have had to make do with bones and
stones, thin gruel from which to craft a
narrative. Now we can extract DNA from
fossils. Not just in bits and pieces, each as
enigmatic as a broken tooth or a chipped
stone flake — but entire genomes. Unlike
fossils, genomes can tell stories. They can
legitimately link species into skeins of
common ancestry and descent.

If there is one name associated with
ancient DNA, it is Svante Pääbo. Now at
the Max Planck Institute for Evolutionary
Anthropology in Leipzig, Germany, Pääbo
pioneered and has largely led the field for
the past three decades. His book, Neanderthal
Man, is perfectly timed, beautifully written
and required reading — it is a window onto 
the genesis of a whole new way of thinking. 
(I should add a disclaimer at this point. I 
have a walk-on part in Neanderthal Man. 
Pääbo is as disarmingly candid about journals
and editors as he is about anything else.
I get off lightly.) 

The book is primarily a memoir. Pääbo
recounts his life story with a Fennoscandian 
frankness that some readers might find dis- 
concerting. Along the way, he tells us a great 
deal about science and scientists. There is 
mercifully little of the didactic treatment of 
the structure of DNA and genes that authors 
feel obliged to rehearse on such occasions. 
Dispensing quickly with such banal necessities,
Pääbo gets on with the cutting-edge
science to which he was witness, and in some
cases helped to create — the astonishing
development of devices that could be used
to sequence DNA ever more efficiently and
at lower and lower cost. He describes the 
technology clearly, almost like a recipe book: 
you feel you should have Neanderthal Man 
on the bench as you try its techniques for 
yourself. 

Thanks to these developments, scientists 
are finding many more species of extinct 
hominin lurking out there in the shadows, 
betrayed by no known fossil evidence. For 
example, Denisovans, extinct hominins 

that lived in Sibe- 


“You feel you ria until relatively 
should have recent times, 
Neanderthal are much better 
Man on the known from their 
benchasyoutry DNA than from 
its techniques the tally of their 
for yourself.” fossils — a small, 

nondescript finger 


bone and a peculiar tooth. And yet in their 
DNA are traces of yet another unknown 
species — glimpsed only as stretches of 
nucleotides — as evanescent as the smile of 
the Cheshire Cat. 

Pääbo illustrates how the advent of
ancient DNA has already had a profound 
effect on our understanding of human evo- 
lution. Skulls and skeletons, once put away 
in cupboards lest they frighten the under- 
graduates, are being brought out into the 
light. Some of these peculiar specimens — 
such as the skull from Iwo Eleru in Nigeria 
that looks archaic but is only 13,000 years 
old — may represent evidence of a richer 
and much more diverse prehuman his- 
tory than we are used to thinking about. 
It has taken the recovery of ancient DNA, 
not more fossil bones, to jolt us into this 
wider reality, to force our gaze over a great, 
unexplored new world. 

But as Pääbo recounts, there have been
many false positives along the way. He deals 
harsh judgement on some of the grand 
claims from the Wild-West phase of ancient 
DNA research, before secure protocols had 
been established (and no, we at Nature don’t
escape his searchlight glare). And he does 
not spare himself from criticism. He looks 
back on the beginnings of his career in the 
1980s, when, torn between a fascination for 
Egyptology and biochemistry, he mixed 
the two and tried to extract DNA from an 
Egyptian mummy. He thought he was mak- 
ing history. What he made was a mess. But, 
like all true scientists, he never gave up, 
finding all sorts of ways to achieve his goals, 
inventing new techniques and new ways of 
seeing. Eventually, in 1985, he reported the 
successful cloning of DNA from a mummy, 
and history was made. ■


Henry Gee is a Senior Editor of Nature 
and the author of The Accidental Species: 
Misunderstandings of Human 
Evolution. 
Books in brief


A Natural History of Human Thinking 

Michael Tomasello HARVARD UNIVERSITY PRESS (2014) 

In this prequel to his 1999 Cultural Origins of Human Cognition 
(Harvard University Press), developmental psychologist Michael 
Tomasello argues that human thinking is unique because it is 
cooperative. He posits that environmental upheavals forced early 
humans to channel their thinking towards collective aims through 
two evolutionary innovations: collaboration while foraging, and the 
rise of culture as population and competition burgeoned. Tomasello 
convincingly sets out how “shared intentionality”, in which social 
complexity spawned conceptual complexities, sets us apart. 


How Numbers Rule the World: The Use and Abuse of Statistics in Global Politics

Lorenzo Fioramonti ZED Books (2014)

Globally, we love statistics. Indexes and indicators produced by
social-science bodies alone number in their hundreds, providing 
grist for policy mills around the world. In this intelligent study of 
pervasive quantification, Lorenzo Fioramonti questions its grip on 
society. Numerical reasoning in overdrive, he argues, can create 
distorted pictures of real life, amplify the power of markets and sap 
debate. Packed with cogent analyses of everything from credit-rating 
agencies to the manipulation of statistics by climate sceptics. 


Oxygen: A Four Billion Year History

Donald E. Canfield PRINCETON UNIVERSITY PRESS (2014) 

Ecologist Donald Canfield delivers an engaging and authoritative 
primer on oxygen, that vital element comprising more than one-fifth
of our atmosphere. In tracing its 4-billion-year history, Canfield
proffers cutting-edge findings on geological and biological questions
from deep time. He explores Earth’s ‘Goldilocks’ status; squeezes
into the Alvin deep-diving submersible to muse on life before oxygen;
and probes photosynthesis, the rise of oxygenating cyanobacteria,
stabilization of atmospheric oxygen, the ‘great oxidation event’
2.4 billion years ago, and the ancient links between organisms and O₂.


Cybersecurity and Cyberwar: What Everyone Needs to Know 

P. W. Singer and Allan Friedman OXFORD UNIVERSITY PRESS (2014) 
The pace of global digitization, and the widespread lack of 
understanding of related security risks, is a ticking time bomb. 
Thus argue P. W. Singer and Allan Friedman in this broad-ranging 
overview of cybersecurity. They start with basics such as software 
vulnerabilities, then delve into the implications of and solutions to 
security breaches, touching on hot issues such as resilience and the 
controversial use of overlay systems that endow online anonymity, 
such as Tor. If you don’t know your asymmetric cryptography from 
your spear phishing, this is a thoughtful introduction. 


Mindless: Why Smarter Machines are Making Dumber Humans 
Simon Head BASIC Books (2014) 

‘Computer business systems’ (CBS) are increasingly embraced in 
finance, business and health care to monitor the performance of 
employees digitally — to pernicious effect, argues Simon Head. At 
a time of deepening inequalities in wealth, he writes, such complex 
digital control of workplace behaviour disempowers those who 
can ill afford it. Head presents compelling examples of the impacts 
of CBS at Goldman Sachs, Amazon and Taiwanese electronics 
manufacturer Foxconn, among others. Barbara Kiser 
Bedrock of China


Xu Xing applauds a study tracing the links between Chinese nationalism and geology. 


Chinese science has long been tightly
entangled with nationalism. An illuminating
case study is the development
of geology during the Republican era
(1911-49). This followed an unusual pattern,
striking a balance between the interests of science,
the nationalist movement, the state and
scientists in difficult, unstable circumstances.
Science historian Grace Yen Shen chronicles
the field’s evolution in Unearthing the Nation.

Shen begins with an account of foreign 
exploration in Chinese territory from the 
mid-nineteenth to the early twentieth 
centuries, such as US geologist Raphael 
Pumpelly’s investigations of the coalfields 
near the Yangtze River in the 1860s, and Ger- 
man geologist Ferdinand von Richthofen’s 
field trips across China not long after. Richt- 
hofen went on to publish milestone works 
such as the five-volume China: The Results 
of My Travels and the Studies Based Thereon 
(1877-1912). In the early twentieth century, 
Chinese researchers, including the German- 
trained Gu Lang and Zhou Shuren, pub- 
lished on geology themselves. 

Zhou (who under his pen name Lu Xun is a
giant of Chinese fiction) was the first Chinese 
person to write on the field, in Brief Outline 
of Chinese Geology (1903). But as Shen notes, 
it was the investigations of Zhang Hongzhao, 
Ding Wenjiang, Weng Wenhao, Li Siguang 
and others around this time that marked 
the first stirrings of a homegrown discipline. 
Weng became the first Chinese geologist 
to earn a doctorate, after investigating the
igneous formations of Belgium for his thesis
at the University of Louvain. These pioneers,
Shen says, saw fieldwork as helping
China to “understand its own territory”:
science thus became a means of nation-building.

Yet for years, Chinese geology remained 
internationally collaborative in terms of 
practitioners, fieldwork, institutions and 
publications. In the 1920s, China was pri- 
marily agrarian and lacked the financial and 
intellectual resources to cultivate science. 
The Geological Society of China (GSC), 
established in 1922, was the first scientific 
association initiated by Chinese investigators.
Its 78 members in the first year included
23 foreigners, among them Swedish
geologist Johan Gunnar Andersson, who
contributed to the discovery of the Peking 
Man Homo erectus fossils. The Bulletin of 
the Geological Society of China, launched in 
1922 and one of the first technical journals 
dedicated to Chinese geology, was published 
mainly in Western languages, including Eng- 
lish. US geologist Amadeus Grabau (1870- 
1946), who spent most of his academic life in 
China, made huge contributions to Chinese 
palaeontology and stratigraphy, and the New 
York-based Rockefeller Foundation spon- 
sored organizations such as the Cenozoic 
Research Laboratory in Beijing, established 
in 1928 to investigate the Peking Man fossils. 


Unearthing the 
Nation: Modern 
Geology and 
Nationalism in 
Republican China 
GRACE YEN SHEN 
University of Chicago 
Press: 2014. 


Chinese geology students on a field trip in about 1950. 
Chinese geologists persisted in fostering
an independent discipline, even in 1927-37, 
when frequent conflicts flared between the 
government in Nanjing and local warlords, 
and within the ruling party. Weng and oth- 
ers recognized that their field could help to 
satisfy practical needs of the state such as 
the search for fossil fuels, and could build 
national pride. A platform came in 1936 
with the GSC’s Chinese-language journal 
Dizhi Lunping (Geological Review). And 
the Second Sino-Japanese War of 1937-45 
was a watershed: the drive to find natural 
resources for the war effort led to achieve- 
ments such as the discovery of China's first 
oil fields. Towards the end of the Republican 
era, a truly Chinese geological community 
had come together. 

Shen's chronicle reveals a broader trend 
in Chinese science. In the 1930s, Weng and 
several other foundational Chinese geologists 
became high-level government officials. The 
desire of Chinese intellectuals to build a great 
nation has often led outstanding researchers 
into administration and politics, a tradition 
reflected in the saying ‘Xue er you ze shi’ 
(‘Officialdom is the natural destination for good 
scholars’). The trend persists; in the long run 
I feel that it will harm Chinese science. 

Unearthing the Nation is more than a scien- 
tific history. Shen's in-depth analysis reveals 
that national, political and cultural loyalties 
had a key role in the development of Chi- 
nese geology, and she seamlessly integrates 
this into her narrative on the discoveries and 
evolution of the field. Shen includes Chinese 
characters in the text, which makes the book 
more congenial for those who can read Chi- 
nese, and adds colour for those who cannot. 

I would have loved to see more infor- 
mation on specific scientific discoveries, 
and Shen’s tendency to focus on a limited 
number of key geologists and organizations 
sometimes obscures the larger picture. Nev- 
ertheless, this is an important book: it pre- 
sents a comprehensive history of Chinese 
geology while demonstrating the discipline’s 
unique pattern of development. Implicit in 
it is the significance of openness to the international 
community, even in the development 
of a national scientific discipline. 


Xu Xing is a professor in the Institute 
of Vertebrate Paleontology and 
Paleoanthropology of the Chinese Academy 
of Sciences in Beijing. 

e-mail: xu.xing@ivpp.ac.cn 
Trevor Cox bursts a balloon under a railway arch in Manchester, UK, to demonstrate the resulting echo. 


Trevor Cox 


The sound hunter 


Acoustical engineer Trevor Cox has designed concert halls, but recently turned to ‘sound tourism’ 
— gathering audible phenomena worldwide for his book Sonic Wonderland. He talks about 
burping sand dunes, the bass baritone of a cracking glacier and the hiss of the nervous system. 


How did you get 
into sound tourism? 
A few years ago 
in London I went 
down a Victorian 
sewer and heard 
this amazing spi- 
ralling echo. It 
made me wonder 
what other curious 
sounds might be 
out there. I decided 
to turn my training in acoustic engineering 
on its head and seek out aural distortions and 
illusions. Travel guides to the sights rarely said 
much about sounds, and I couldn’t find a list 
of the most amazing sounds in the world. So I 
started a blog mapping unusual sonic spaces, 
and that became Sonic Wonderland. 


What happens with noise in a chamber? 

Whenever you make noise in an enclosed 
space, millions of reflections bounce around 
the room. Echo is when you hear a separate 
reflection, such as in a bad concert hall 
where the trumpets sound like they are com- 
ing both from the stage and from the wall 
behind you. There is a mosque in Iran where, 
when you flick a piece of paper, the echo 
bounces between the ceiling and the floor 
seven or eight times. When your brain lumps 
room reflections into one event, that is rever- 
beration. It adds a subtle bloom; without it 
your voice sounds dry and muffled. 


What are the most reverberant spaces you 
have visited? 

A mausoleum in Scotland claims to be the 
most reverberant place in the world, but it 
didn’t seem that impressive on my visit. Then 
I learned about Inchindown in northwest 
Scotland, a depot built to protect fuel during 
the Second World War. The oil tanks are the 
size of enormous cathedrals dug into the side 
of a mountain, with thick concrete walls and 
no doors, and to get in you have to go through 
the pipework. You can have a quiet conversa- 
tion in there because the walls are so far away. 
But as soon as you raise your voice this fog 
rises around you, a haze of echoes that build 
up and resonate for about a minute and a half. 


How does noise travel over large distances? 

In the early nineteenth century, sailors off the 
coast of Brazil reported hearing the sound of 
bells ringing some 160 kilometres away. That 
may have been due partly to the concave 
shape of the ship’s sail, which might have 
reflected the sound, and to a wide layer of cold 
air over the ocean that might have refracted 
the sound back downwards to the ship. For 
similar reasons, on some nights when the 
weather is right, I can hear the crowd at Man- 
chester United's football ground quite clearly 
from my house, even though it is more than 
three kilometres away. Inside buildings, it has 
been known for centuries that a curved ceil- 
ing can transmit sound across a large room. 
In some cathedrals, if you whisper into the 
walls, the sound will skim the dome and come 
through clearly tens of metres away. 


What about the sounds of ice? 

Ice makes so many sounds. You can get the 
most catastrophic bass notes when great 
chunks of a glacier drop off into the ocean. If 
you throw rocks on to a frozen lake, it sounds 
a bit like phaser gunfire from George Lucas's 
Star Wars films. When bits of ice wash up 
around the shore of a glacial lagoon, you get 
a gentle tinkling sound like wind chimes. 
There is a musician in Norway, Terje Isungset, 
who makes trumpets and xylophones out of 
ice. He has to source it from lakes that froze 
slowly, to ensure a regular crystalline struc- 
ture. He calls them “the only instruments you 
can drink after you've finished playing”. 


And sand dunes? 

Explorers Marco Polo and Charles Darwin 
observed that some sand dunes make rum- 
bling sounds when you walk on them, owing 
to the uniform size of the grains and whether 
they are loose and sifted. Scientists describe 
the sound as you walk on the dune as burp- 
like, but to me it sounds more like a tuba. If 
you scoot down the dune on your rear, you 
can get a couple of metres of sand to vibrate. 
With more people, more of the dune surface 
vibrates and that creates a huge avalanche of 
sound, a continuous booming that can travel 
for a kilometre or so. 


Is total silence possible? 

Before I wrote the book, my answer would 
have been no. When I work in my anechoic 
chamber, a room that deadens most sound, 
I have found that you cannot get rid of bodily 
sounds such as blood pumping through 
your head or, if you are unlucky, a hissing 
that is probably spontaneous firings on the 
auditory nerve, a bit like tinnitus. But while 
researching the book I went to some places 
— such as a sensory-deprivation flotation 
tank and a remote peat bog in Northumber- 
land, UK — where I was not conscious of any 
sound whatsoever. My suspicion is that, for 
some continuous sounds like the hissing of 
the nervous system, your brain just learns to 
ignore it after a while. 
Medical data: the 
choice to opt out 


You accuse the National Health 
Service (NHS) in England of 
using “sleight of hand” in the way 
we are advertising the care.data 
programme (see go.nature.com/srp5nu), 
suggesting that we should make it clearer to 
people that the programme poses 
potential risks to their privacy and 
that they can opt out of it (Nature 
505, 261; 2014). We believe that 
this accusation is unwarranted. 

“You have a choice” is written 
in bold on the cover of the leaflet 
about the programme, which is 
being sent to every household 
in the country. The leaflet goes 
on to say: “If you do not want 
information that identifies you 
to be shared outside your GP 
[general practitioner] practice, 
please ask the practice to make 
a note of this in your medical 
record.” 

Last month, we published 
a detailed assessment of the 
potential negative and positive 
impacts of the programme on 
privacy (see go.nature.com/xcqaql). 
And, most importantly, 
patients have the opportunity to 
discuss the changes with a trained 
adviser and with their GP. 

It would be unethical to 
introduce this opt-out system 
without proper publicity, as 
well as illegal under the UK 
Data Protection Act 1998. This 
accounts for the scale of our 
awareness-raising strategy and 
our advice last August to all GP 
practices to start telling people 
about the proposed changes. 
Geraint Lewis NHS England, 
Leeds, UK. 
geraint.lewis@nhs.net 
Competing financial interests 
declared: see go.nature.com/sluxqa 
for details. 


Medical data: widen 
use in research 


The Wellcome Trust and other 
UK medical-research charities 
support the plans of the National 
Health Service (NHS) in England 
to make better use of information 
from patients’ records, but we 
have no wish to downplay the 
right of people to opt out of the 
NHS care.data programme 
(go.nature.com/srp5nu), as you imply 
(Nature 505, 261; 2014). Like 
you, we believe it is critical that 
the risks, benefits and choices are 
explained clearly to everyone. 

We have launched a campaign 
to support the wider use of 
medical records for research 
through mechanisms such as 
the Clinical Practice Research 
Datalink, rather than the 
care.data programme specifically 
(see www.patientrecords.org.uk). 
It is intended to complement 
NHS England’s communications 
by highlighting the choices 
people have alongside the 
research benefits we perceive, 
and to help people to reach an 
informed decision. 

Those with concerns about 
sharing patient data are right in 
that no system can guarantee 
protection against determined 
misuse. We have confidence, 
however, in the strict safeguards 
that govern the research use 
of medical records, which can 
manage those risks while enabling 
research to benefit from a 
national cradle-to-grave data set. 
Jeremy Farrar Wellcome Trust, 
London, UK. 
j.farrar@wellcome.ac.uk 
Competing financial interests 
declared: see go.nature.com/Iskrj4 
for details. 


Planck team replies 
to data ‘anomalies’ 


We would like to clarify some 
points arising from your News 
report on the debate over data 
from the European Space 
Agency's Planck mission (see 
Nature http://doi.org/q8t; 2013). 
The cosmological parameters 
estimated by the Planck 
Collaboration are statistically 
compatible with those estimated 
by NASA’s Wilkinson Microwave 
Anisotropy Probe (see 
G. Hinshaw et al. Astrophys. J. 
Suppl. S. 208, 19; 2013). Also, 
the analysis of the Planck data 
by David Spergel and colleagues 
(see preprint at http://arxiv.org/abs/1312.3313; 
2013) is actually 
in close agreement with our own 
(http://arxiv.org/abs/1303.5076; 
2013): the values of their 
parameters are within one 
standard deviation of ours. 

For example, their value of the 
Hubble constant is within 0.6 of 
a standard deviation of ours; the 
matter density and the amplitude 
of the fluctuation spectrum differ 
by about one standard deviation. 
These differences, which are not 
evident in our analyses of the 
Planck data, could be caused 
by methodological variations 
between the respective analyses 
rather than by systematic errors 
in the Planck data. 

We, and Spergel and colleagues, 
have verified that the small, time- 
dependent systematic errors that 
affect a subset of the data at a radio 
frequency of 217 gigahertz, which 
we reported on in the revised 
versions of the Planck papers 
from 2013, have little impact 
on the Planck Collaboration’s 
cosmological results. 

Jan Tauber European Space 
Agency, Noordwijk, the 
Netherlands, and the Planck 
Science Team (see go.nature.com/q5ltry), 
on behalf of the Planck 
Collaboration. 
jtauber@rssd.esa.int 


Carbon dioxide 
storage is secure 


The Sleipner gas field in the North 
Sea has the world’s first purpose- 
engineered subsea geological 
storage site for carbon dioxide. 
Contrary to your headline’s 
implication, seabed fractures do 
not pose any threat to this project 
(Nature 504, 339-340; 2013). 
Independent researchers have 
analysed extensive data from 
site monitoring using seismic- 
reflection surveys of the deep 
subsurface (both before CO₂ 
injection and then at two-year 
intervals); they found that 
performance is excellent, with 
no evidence of any CO₂ leakage 
(see A. J. Cavanagh and R. S. 
Haszeldine, Int. J. Greenh. Gas 
Con. 21, 101-112; 2014). 

Your graphic, which juxtaposes 
stored CO₂ with fractures, is also 
misleading: Sleipner is in fact 25 
kilometres away from the fracture 
described and is overlain by 500 
metres of sealing mudrock from 
the estimated depth of the crack. 
Elsewhere beneath the North Sea, 
mudrocks have retained natural 
CO₂ for tens of millions of years. 

The suggestion that leakage 
would be “a disaster for public 
opinion” is unsupported. Social- 
science research indicates that 
unintended leakage need not be 
a show-stopper (see L. Mabon 
et al. Mar. Policy 45, 9-15; 2014). 
More than guarantees that sites 
will never leak, the public seeks 
reassurance that site selection 
minimizes leakage risk, and that 
monitoring and remediation 
procedures are in place should a 
leak be discovered. 

There are many known fluid 
conduits beneath the North 
Sea, but there is no evidence 
of unplanned CO₂ or methane 
movement in the rocks overlying 
the storage site. Since the Sleipner 
project was set up 20 years ago, 
global endeavours have improved 
the geoscientific identification, 
operation and monitoring of 
CO₂ storage (see V. Scott et al. 
Nature Clim. Change 3, 105-111; 
2013). Sleipner’s CO₂ is securely 
retained by residual saturation in 
the reservoir, multiple mudrock 
seals, and eventual dissolution 
and dispersion in pore waters. 
Vivian Scott* Edinburgh 
University, UK. 
vivian.scott@ed.ac.uk 
*On behalf of 6 co-signatories (see 
go.nature.com/zivosz for full list). 
John Cornforth 


(1917-2013) 


Nobel-prizewinning chemist who tracked how enzymes build cholesterol. 


Life depends on the geometric 
outcomes of enzymatic reactions. 
Even when molecules 
are exact mirror images of each other, 
enzymes treat the ‘left-handed’ and 
‘right-handed’ versions differently. 
John Cornforth identified which of a 
series of mirror images interact with 
the enzymes that carry out the natural 
synthesis of cholesterol. This work, for 
which he shared the 1975 Nobel Prize 
in Chemistry, laid the foundations 
for many studies of how cells build 
organic compounds. 

Cornforth, who died on 8 December 
2013, was born in Sydney, Australia, in 
1917. By the time he was ten, the first 
signs of his oncoming deafness had 
become apparent. As a boy, he built 
his own rudimentary laboratory at 
home. And, encouraged by a school 
teacher, he entered the University of 
Sydney at the age of 16 to read chem- 
istry, a subject in which he thought his 
deafness would be less of a handicap. 
Although unable to hear the lectures, 
his thorough study of the scientific literature 
enabled him to graduate in 1937 with a first- 
class honours degree and a university prize. 

Boyhood rambles in the bush inspired 
Cornforth’s interest in natural products, and 
he began graduate studies at the University 
of Sydney. A number of his early papers were 
on the constituents of Australian plants, such 
as the caustic vine (Sarcostemma australe). 
His lifelong nickname, Kappa, arose from 
chemists’ habit at the time of engraving 
their glassware: his initials (JC) resembled 
the Greek letter. 

Cornforth’s deafness led to an intense 
loneliness that was alleviated by the com- 
panionship in the laboratory. The skills 
he developed in his home lab, of building 
and repairing experimental apparatus, had 
many benefits. One was meeting the talented 
chemist, Rita Harradence, who asked him to 
repair a flask. In 1941, she became his wife. 
Throughout his career she acted both as an 
interpreter and a collaborator; they authored 
more than 40 papers together. 

In 1939, Cornforth and Harradence were 
awarded scholarships for doctoral studies 
under Robert Robinson, an organic chemist 
at the University of Oxford, UK, who won 
the 1947 Nobel Prize in Chemistry. They 
began work on the synthesis of steroids, a 
biologically important class of complex, 
multi-ringed organic compounds that 
includes cortisone, estrone and testosterone. 
This effort eventually bore fruit in the first 
total synthesis of an androgenic hormone, 
reported in 1953 (H. M. E. Cardwell et al. 
J. Chem. Soc. 361-384; 1953). 

In 1942, as part of the joint US-UK war 
effort, the couple joined the team working 
on the structure of the antibiotic penicillin. 
Cornforth made a number of important 
contributions, including identifying and 
synthesizing penicillamine, a key degrada- 
tion product of penicillin. This work stimu- 
lated Cornforth’s investigations into the class 
of compounds known as oxazolones, includ- 
ing a type of chemical rearrangement that 
now bears his name. 

In 1946, the Cornforths moved to the 
Medical Research Council National Insti- 
tute for Medical Research in London. Here, 
Cornforth continued his work on steroid 
and oxazolone chemistry and began a very 
fruitful collaboration with the medical bio- 
chemist George Popjak, which continued 
when they became co-directors in 1962 
of Shell Research’s newly set up Milstead 
Laboratory of Chemical Enzymology in 
Sittingbourne, UK. 

Like many chemists, Cornforth was 
intrigued by how natural products are 
formed. In the years after the Second World 
War, the radioisotope carbon-14 
became available for basic research, 
providing a way to establish the bio- 
logical building blocks of larger 
molecules. Other researchers had 
already begun to figure out the origin 
of certain carbon atoms in a side chain 
of the cholesterol molecule by using 
radioactively labelled acetate, a small 
organic compound containing only 
two carbon atoms. Cornforth took on 
the more demanding experimental 
work required to establish the origin 
of each of the carbon atoms in choles- 
terol’s four conjoined molecular rings. 

He identified 14 steps in the early 
stages of the natural formation of 
cholesterol. In each of these steps, the 
intermediate products could be trans- 
formed in one of two ways. His design 
for labelling experiments defined a 
single pathway out of the 16,384 (2¹⁴) 
possibilities. 

In another series of experiments, 
on acetic acid, Cornforth labelled the 
hydrogen atoms around a carbon, 
replacing the hydrogens with the isotopes 
deuterium and tritium such that each had a 
distinct position around carbon. These clas- 
sic experiments opened up the possibility of 
exploring a wide range of enzyme reactions, 
including fatty-acid biosynthesis. 

In 1975, the same year that he won the 
Nobel prize for decoding the stereochem- 
istry of biosynthetic reactions, Cornforth 
accepted a Royal Society research professor- 
ship at the University of Sussex, UK. There, 
he began an extremely ambitious project to 
craft a compound that could act as an ana- 
logue for hydratase, the enzyme that adds 
water to another molecule. 

He was knighted in 1977 and made a 
Companion of the Order of Australia in 
1991. Kappa lectured undergraduates and 
supervised student projects until he was 
well into his 80s, often enhancing conver- 
sations with an aptly worded limerick. His 
kindness, generosity and humour were 
appreciated by all with whom he came into 
contact. 


Jim Hanson is professor emeritus of 
chemistry at the University of Sussex, 
Brighton, UK. He worked in the same 
laboratory as John Cornforth for three 
e-mail: j.r.hanson@sussex.ac.uk 


FEBRUARY 2014 | VOL 506 | NATURE | 35 


BETTMANN/CORBIS. 


NEWS & VIEWS 


FORUM: Microbiology 


A talented genus 


Members of a newly described candidate bacterial genus, Entotheonella, have been identified as the sources of the rich 
array of natural products found in the marine sponge Theonella swinhoei. Two scientists discuss this discovery from the 
perspectives of microbial ecology and drug discovery. SEE ARTICLE P.58 


Hidden depths 


MARCEL JASPARS 


In this issue, Wilson et al. describe the 
discovery of two new bacterial species 
with large genomes and rich biosynthetic 
repertoires. This combination is so rare that the 
new phylum to which they have been assigned 
might be heralded as the successor to the 
Actinobacteria, the phylum responsible for 
many of the world’s antibiotics and anticancer 
agents. How did their discovery come about? 
By studying sponges: organisms identified by 
the pioneers of marine natural-product chem- 
istry as the source of unparalleled chemical 
diversity. 

This identification raised questions about 
how sponges produce such a range of com- 
pounds and what their roles might be. The vari- 
ety of chemical reactions observed seemed too 
broad to be produced by a single organism, and 
sponges collected from different locations had 
different and often non-overlapping metabolite 
profiles. It was only when it was noticed that 
similar compounds could be found in organ- 
isms as divergent as sponges and beetles that a 
common, microbial origin was suggested. 

Early work to confirm this idea involved 
blending and centrifuging sponges to sepa- 
rate cell populations, which revealed that the 
microorganisms living in the sponges had bio- 
synthetic repertoires distinct from those of the 
sponge cells. Subsequent studies tried to get 
a clearer picture of which organism produced 
which compound, but the possibility of com- 
pounds moving from the true producer to other 
organisms obfuscated a clear interpretation. 

Despite this, evidence mounted that the 
producer was a bacterium named Candidatus 
Entotheonella palauensis (Candidatus 
indicates that the bacterium had not yet been 
cultured). Subsequent comparisons showed 
that the pathways responsible for produc- 
ing the related compounds pederin (isolated 
from the Paederus beetle) and theopederin A 
(isolated from the sponge Theonella swinhoei) 
were highly similar and probably came from a 
bacterium associated with the sponge and the 


beetle. However, this combined evidence, 
although suggestive, did not quite complete 
the loop between microorganism, biosynthetic 
genes and chemistry. 

Wilson and co-workers’ study finally closes 
these gaps in our understanding and, in doing 
so, reveals hidden depths of biosynthetic 
capacity in a candidate phylum that they 
name Tectomicrobia (from the Latin tegere, 
to hide, to protect). The authors combined 
previous experimental separation methods 
with whole-genome sequencing of candidate 
organisms to assess the number and range of 
biosynthetic gene clusters present in members 
of the phylum’s only genus discovered so far, 
Entotheonella. There is now incontrovertible 
evidence that T. swinhoei is host to this genus, 
and that Entotheonella species have large 
genomes (greater than 9 megabases), of which 
a high proportion is dedicated to natural- 
product biosynthesis (Fig. 1). 

The authors assigned gene clusters to the 
biosynthesis of several compounds identified 
in T. swinhoei extracts, including onnamides/ 
theopederins, polytheonamides, kerama- 
mides/orbiculamides and cyclotheonamides, 
and identified a further 24 biosynthetic clusters 
with predicted or unknown products. 
There is little apparent overlap in 
biosynthetic repertoire between the 
two Entotheonella species the authors 
have so far described, indicating a vast 
potential for new chemistry in this 
phylum. Thus, it 
seems that members of Tectomicrobia are tal- 
ented producers of chemical diversity, similar to 
the Actinobacteria and Cyanobacteria, which 
both include species with large genomes and 
many biosynthetic gene clusters. It also seems 
that Tectomicrobia are widespread: Wilson 
et al. analysed 37 taxonomically diverse sponge 
species from 20 locations, including some from 
geographically distant regions, and found 
Entotheonella species in 28 of the samples. 
This study shows that new biosynthetically 
talented microorganisms can be discovered, 
and suggests that systematic searches will yield 
further species in this phylum, as well as new 
phyla. Questions that remain include whether 
marine sponges are the only hosts for this phy- 
lum or whether it is more widespread; what 
the benefit is to the sponge of hosting such a 
talented symbiont; and how its presence in the 
sponge is controlled. 


Marcel Jaspars is at the Marine Biodiscovery 
Centre, Department of Chemistry, University 
of Aberdeen, Old Aberdeen AB24 3UE, UK. 
e-mail: m.jaspars@abdn.ac.uk 


Supply and source 


Bioactive natural products isolated from 
sponges and other marine animals offer 
interesting possibilities for treating cancer and 
other diseases. However, obtaining sufficient 
quantities of such metabolites from the marine 
environment for clinical trials is challenging. 
Wilson and colleagues’ identification of bacte- 
ria from the candidate genus Entotheonella as 
the producers of most of the metabolites isolated 
from T. swinhoei suggests new approaches 
for overcoming this supply problem. 

Natural products have diverse applications 
in medicine and agriculture. Iconic examples 
include penicillins and cephalosporins, used 
to treat bacterial infections; the cancer drug 
paclitaxel (Taxol); artemisinin, which targets 
the malaria parasite; the cholesterol-lowering 
drug lovastatin; and the insecticide spinosyn. 
The overwhelming majority of such com- 
pounds are produced by plants or terrestrial 
microorganisms. 

Although marine sponges are another 
important source of bioactive natural prod- 
ucts, only a handful of sponge natural products 
have entered the market. This is due primarily 
to the supply problem. For example, consider- 
able quantities of a drug candidate are required 
for clinical trials, but only a few milligrams of 
most natural products can be isolated from the 
marine environment and it has hitherto proved 
impossible to cultivate the sponges from which 
such compounds are derived. 

Figure 1 | Sponge’s secret. Wilson et al. show that many of the chemically diverse natural products 
found in the marine sponge Theonella swinhoei (a) are produced by biosynthetically talented bacterial 
symbionts (b; false-coloured), which they assign to the candidate genus Entotheonella. 
A, TOSHIYUKI WAKIMOTO; B, TETSUSHI MORI 

One approach to solving this problem is 
total chemical synthesis, which has been 
successfully used to produce the anticancer 
compounds discodermolide and eribulin. 
However, the structural complexity of most 
marine natural products means that develop- 
ing efficient routes for their total synthesis is 
challenging. Another approach involves semi-synthesis 
from a structurally related metabolite. 
This has been applied to the anticancer 
agent trabectedin, isolated from a sea squirt, 
which can be synthesized from safracin B pro- 
duced by a cultivable terrestrial bacterium. But 
such routes are viable only if an abundant supply 
of an appropriate precursor is available. 

Evidence has been mounting that unculti- 
vated bacterial symbionts of marine sponges, 
rather than the sponges themselves, are the true 
producers of many bioactive metabolites. 
However, it has been unclear whether several 
microbial inhabitants are responsible, or just 
one. Wilson et al. have answered this question, 
although it remains to be seen whether their 
report of Entotheonella species being responsi- 
ble for producing the diverse array of metabo- 
lites isolated from the sponge is a widespread 
phenomenon among other sponges. 

The authors’ findings illuminate two prom- 
ising approaches for addressing the supply 
problem. The first is large-scale cultivation of 
the microorganisms that produce interesting 
metabolites. This is likely to prove difficult, but 
the ability to obtain a draft genome sequence 
from a single microbial cell, as exemplified by 
Wilson and colleagues, may help to determine 
optimal culture conditions for the organisms. 


This comes with the caveat, however, that 
growing such microorganisms in pure culture 
might downregulate their production of bio- 
active metabolites — the biosynthetic path- 
ways for similar metabolites in easily cultivable 
terrestrial microorganisms are often expressed 
poorly in pure cultures, or not at all, presum- 
ably because the environmental cues responsi- 
ble for eliciting them are absent. Thus, genetic 
manipulation may be required to maintain 
desirable levels of metabolite production. 
The second potential way to address the 
supply problem involves expressing the bio- 
synthetic pathway of interest in an easily cultiva- 
ble surrogate host. This synthetic-biology tactic 
has been used to produce a key intermediate of 
artemisinin biosynthesis in yeast. Genome-sequence 
data may help to guide selection of 
the most appropriate surrogate, but extensive 
genetic manipulation will probably be required 
to optimize the production of each metabolite. 
Wilson et al. also show that, as is the case for 
terrestrial bacteria such as Streptomyces species, 
Entotheonella species contain several 
pathways that hint at their ability to assemble 
previously unknown metabolites. This sug- 
gests that members of the genus might serve 
as a useful source of leads for drug discovery. 
Interference identifies 
immune modulators 


A broad in vivo screen of the effects of specific gene inhibition on the antitumour 
activity of immune cells in mice bearing melanomas has revealed potential 
targets for cancer therapy. SEE ARTICLE P.52 


LARS ZENDER 


Therapies designed to boost the immune 
system’s response to tumours hold great 
promise for overcoming drug resistance 
in cancer. Advanced solid tumours inevitably 
develop resistance against currently available 
cytotoxic or molecularly targeted therapies, but 
durable responses have been observed following 
some immunotherapeutic treatments, leading 
to speculation that a cure for some patients 
could be possible. On page 52 of this issue, 
Zhou et al. use RNA-interference technology 
to identify genes that can be targeted to enhance 
the robustness and proliferation of immune cells 
called CD8⁺ T cells in mice bearing melanomas. 

Figure 1 | Screening for T-cell modulators. Zhou et al. compiled libraries of short hairpin RNA (shRNA) molecules designed to specifically inhibit the 
expression of genes expressed by dysfunctional (anergic or exhausted) CD8⁺ T cells or of genes encoding cell-signalling enzymes (phosphatases and kinases). The 
authors then infected CD8⁺ T cells with these shRNAs and implanted the cells into mice bearing melanomas. Seven days later, they isolated T cells from the spleens 
and tumours of the mice and compared the relative representation of shRNA molecules in the two tissues. Those shRNA molecules found to be enriched in the 
tumour were postulated to be involved in facilitating T-cell survival or proliferation within the tumours and were selected for further analysis. 


Two existing targets for cancer immunotherapy 
are the receptor molecules CTLA-4 
and PD-1, which are expressed on the surface 
of T cells and transmit signals that dampen 
the cells’ immune activity. Antibodies that 
bind these receptors, and thereby relieve the 
immune inhibition, have emerged as a power- 
ful treatment for patients with advanced 
melanoma””, and the anti-CTLA-4 antibody 
ipilimumab was the first immunotherapy 
shown to significantly improve the overall 
survival of these patients*. However, patient 
responses to tumour immunotherapy are 
highly variable, and it is unclear why some 
tumours respond and others do not. Thus, 
to improve the efficiency of such treatments 
further, a deeper understanding and better 
mechanistic characterization of antitumour 
immune responses are needed. 

The discovery of RNA interference (RNAi) 
— the process by which small RNA molecules 
specifically inhibit gene expression by bind- 
ing to and inducing the cleavage of messenger 
RNAs — has revolutionized loss-of-function 
genetic studies. The experimental applica- 
tion of RNAi, using collections of short- 
interfering RNA (siRNA) or short hairpin 
RNA (shRNA) molecules, means that screens 
of gene function can be conducted in almost 
every biological system. For example, shRNA 
screens have been used successfully to dissect 
tumour-suppressor gene networks, to identify 
modulators of drug resistance and to pinpoint 
vulnerabilities in cancer cells. Furthermore,
in vitro shRNA screening was recently applied
to identify genes that regulate the differentiation
of T cells into T-helper 1 and 2 subsets.
However, RNAi-based functional genetic
screens are commonly performed in vitro, and
as such do not take into account the effects of
the tumour microenvironment on cancer growth
or immune-cell function.

Zhou and colleagues have taken shRNA
screening in immune cells to the next level by
building on advances in stable shRNA technology
and in vivo shRNA screening. The
authors compiled two thematically focused
libraries of shRNA molecules (Fig. 1). The first
comprised 1,275 shRNAs targeting 255 genes
whose expression is associated with T-cell
exhaustion or anergy — states of functional
inactivity that arise in cancer. The second
contained 6,535 shRNAs targeting 1,307 genes
encoding kinase and phosphatase enzymes
involved in cell-signalling pathways. These
shRNAs were then delivered by lentiviruses
into activated mouse CD8+ T cells carrying a
specific T-cell receptor (OT-1); those cells that
stably expressed shRNA molecules were then
implanted into mice harbouring aggressive
melanomas that expressed a model-antigen
protein (Ova), which can activate the OT-1
T-cell receptor.
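
As a quick arithmetic check on the library sizes quoted above, the short Python sketch below divides the number of shRNAs by the number of target genes in each library. Both work out to five shRNAs per gene, which is in line with the common practice of using several independent hairpins per target so that a hit does not rest on a single hairpin. The dictionary keys and variable names are illustrative only.

```python
# Library sizes quoted in the text: (number of shRNAs, number of target genes).
libraries = {
    "exhaustion/anergy": (1275, 255),
    "kinase/phosphatase": (6535, 1307),
}

# Average number of shRNAs designed against each gene.
per_gene = {name: n_shrnas / n_genes for name, (n_shrnas, n_genes) in libraries.items()}
for name, ratio in per_gene.items():
    print(f"{name}: {ratio:g} shRNAs per gene")
```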

Seven days later, the OT-1 T cells were puri- 
fied from the tumours and spleens of the mice 
and analysed to identify shRNA molecules that 
were substantially more highly represented in 
tumoural than in splenic T cells — the impli- 
cation being that the genes targeted by these 
shRNAs are involved in mediating survival 
or proliferation of T cells in tumours. Top- 
scoring genes were then screened again using 
15 different shRNAs targeting each candidate 
gene. In addition to several shRNAs target- 
ing genes with known functions in T cells, the 
authors identified shRNAs against Ppp2r2d, a 
regulatory subunit of the phosphatase enzymes 
of the PP2A family, as strongly enriched 
in tumours. 
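
The tumour-versus-spleen comparison described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis pipeline: the read counts, the pseudocount, the scaling to reads per million and the twofold threshold are all assumptions made for the example, and the shRNA labels (apart from the gene name Ppp2r2d) are hypothetical.

```python
import math

def log2_enrichment(tumour_counts, spleen_counts, pseudocount=1.0):
    """Return log2(tumour/spleen) for each shRNA after depth normalization.

    Counts are scaled to reads per million so that samples sequenced to
    different depths can be compared; the pseudocount (an assumption) avoids
    division by zero for shRNAs that dropped out of one tissue.
    """
    tumour_total = sum(tumour_counts.values())
    spleen_total = sum(spleen_counts.values())
    scores = {}
    for shrna in tumour_counts:
        tumour_rpm = tumour_counts[shrna] / tumour_total * 1e6
        spleen_rpm = spleen_counts.get(shrna, 0) / spleen_total * 1e6
        scores[shrna] = math.log2((tumour_rpm + pseudocount) / (spleen_rpm + pseudocount))
    return scores

# Hypothetical read counts, invented purely for illustration.
tumour = {"shPpp2r2d_1": 900, "shControl_1": 100, "shControl_2": 100}
spleen = {"shPpp2r2d_1": 100, "shControl_1": 450, "shControl_2": 450}

scores = log2_enrichment(tumour, spleen)
# Keep shRNAs at least twofold enriched in tumour (the threshold is an assumption).
hits = sorted(name for name, value in scores.items() if value >= 1.0)
print(hits)
```

In a real screen the scores would be compared across several independent shRNAs per gene and across replicate mice before a gene is called a hit.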

In further experiments, the researchers 
found that shRNA-induced inhibition of 
Ppp2r2d expression increased the survival of 
CD8+ T cells within the tumour and resulted
in increased intratumoural CD8+ T-cell
proliferation. Most importantly, systemic
delivery of melanoma-targeted CD8+ T cells
expressing an anti-Ppp2r2d shRNA resulted
in increased death of melanoma cells, a significantly
reduced tumour burden over time and
prolonged survival of tumour-bearing mice.
Strikingly, Ppp2r2d suppression not only
improved the antitumour activity of CD8+
T cells, but also increased that of another class
of immune cell, CD4+ T cells, thus suggesting a
broad applicability of Ppp2r2d as a therapeutic
target in T cells.

Zhou and colleagues’ paper breaks ground
on several levels. Their study sets a new stand- 
ard in how the function of immune cells can 
be genetically dissected by RNAi screening 
in vivo. On the basis of the authors’ data, similar 
screens in genetically engineered mouse 
models of different tumour types seem feasi- 
ble and should be given high priority. Further- 
more, functional screens specifically aimed at 
identifying modulators of CD4+ T-cell function
should be pursued.

The report also suggests an exciting poten- 
tial for targeting Ppp2r2d in T cells for cancer 
therapy, assuming that the increased anti- 
tumour T-cell function observed following 
Ppp2r2d inhibition can be validated in other 
models — ideally ones that do not depend on 
model antigens and cell transplantation. It 
will also be interesting to explore the use of 
Ppp2r2d inhibition in conjunction with other 
immunotherapies. For example, Zhou et al. 
showed that shRNA-mediated suppression 
of Ppp2r2d did not reduce the expression of 
the inhibitory receptors PD-1 or LAG-3 on 
tumour-infiltrating T cells, so the effect of 
combining Ppp2r2d inhibitors with PD-1 or 
LAG-3 blockers should be investigated.
ATMOSPHERIC SCIENCE 


Drought and fire 
change sink to source 


Aircraft have captured the ‘breath’ of the Amazon forest — carbon emissions over 
the Amazon basin. The findings raise concerns about the effects of future drought 
and call for a reassessment of how fire is used in the region. SEE LETTER P.76 


JENNIFER K. BALCH 


The Amazon forest accounts for 40%
of the aboveground biomass stored in
the world’s tropical forests, but we do
not know whether this crucial but threatened
biome will be a sink or a source of atmospheric
carbon in the coming decades. Given
the need to predict future climate scenarios,
it is essential to refine our understanding of
tropical forests’ ability to sequester or release
carbon. The profiling of air columns over such
forests by aircraft offers a much-needed window
onto the major fluxes of tropical carbon.
On page 76 of this issue, Gatti et al. report
the first estimate of carbon fluxes from the
Amazon basin obtained in this way over the
course of two years. Their findings suggest
that the combined effects of drought and fires
can cause the Amazon forest to become a net
source of atmospheric carbon.

The authors sampled air masses several 
kilometres above the forest canopy at four 
Amazon locations, creating a patchwork of 
atmospheric profiles of carbon dioxide and 
carbon monoxide that spans the entire Amazon 
basin. They conducted these measurements 
during a major drought year (2010) and a
relatively wet year (2011) for the region. 

The researchers found that, during the 
drought year, burning of vegetation associated 
with land use and reduced photosynthesis 
resulted in 0.48 ± 0.18 petagrams of carbon
(Pg C; 1 Pg is 10^15 grams) being lost from the
Amazon forest biome (uptake by the biome
was 0.03 ± 0.22 Pg C per year; fire emissions
were 0.51 ± 0.12 Pg C per year). During
the wetter 2011, however, the Amazon was
effectively carbon neutral: biome uptake
(0.25 ± 0.14 Pg C per year) very nearly cancelled
the fire emissions (0.30 ± 0.10 Pg C per year).
Temperatures were above average in both 
years, but similar, suggesting that a moisture 
deficit reduced photosynthesis rates in 2010, 
rather than the crossing of a temperature 
threshold. 
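
The two annual budgets above reduce to simple bookkeeping, sketched here in Python. The sign convention (positive means a net source to the atmosphere) and the function name are assumptions for illustration; only the central estimates quoted in the text are used, and the uncertainties are not propagated.

```python
def net_carbon_flux(fire_emissions, biome_uptake):
    """Net flux to the atmosphere in Pg C per year (positive = net source)."""
    return fire_emissions - biome_uptake

# Central estimates quoted in the text (Pg C per year); uncertainties omitted.
budgets = {
    2010: {"fire": 0.51, "uptake": 0.03},  # drought year
    2011: {"fire": 0.30, "uptake": 0.25},  # wetter year
}

net = {
    year: round(net_carbon_flux(b["fire"], b["uptake"]), 2)
    for year, b in budgets.items()
}
for year, value in net.items():
    print(f"{year}: net flux to atmosphere = {value:+.2f} Pg C per year")
```

With these inputs the drought year comes out as a clear net source, whereas the wetter year is close to zero and well within the quoted uncertainties of the individual terms.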

The growth rate of atmospheric CO2 levels
observed over the past five decades at Mauna
Loa, Hawaii, and at the South Pole was recently
shown to be highly sensitive to year-to-year
variability in tropical temperatures, and is
further moderated by moisture conditions.
This finding, taken together with Gatti and
colleagues’ study, implies that a shift in the
terrestrial carbon cycle may be occurring
because of the sensitivity to drought of tropical
forests globally.

The world’s vegetation takes up about 
2.6 ± 0.7 Pg C per year, compared with around
9 Pg C per year emitted to the atmosphere,
mostly as CO2, from fossil-fuel combustion
and cement production. The Amazon forest
accumulated an average of 0.4 Pg C per year
in the two decades before 2005 — a range
of 0.3–0.6 Pg C per year, estimated through
repeated sampling of nearly 100 permanent
plots across the basin — and so has had a substantial
role in offsetting global anthropogenic
emissions of greenhouse gases. Whether this 
annual uptake will persist and compensate for 
emissions related to drought and land use in 
the future remains uncertain. 
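
To put the figures in this paragraph side by side, the Python sketch below expresses the Amazon's average accumulation as a share of global vegetation uptake and of fossil-fuel and cement emissions. It uses only the central estimates quoted above; because those estimates come from different methods and periods, the ratios should be read as rough orders of magnitude rather than precise results. The variable names are illustrative.

```python
# Central estimates quoted in the text, in Pg C per year.
global_vegetation_uptake = 2.6
global_emissions = 9.0
amazon_accumulation = 0.4  # plot-based average for the two decades before 2005

share_of_land_sink = amazon_accumulation / global_vegetation_uptake
share_of_emissions = amazon_accumulation / global_emissions
print(f"Share of global vegetation uptake: {share_of_land_sink:.0%}")
print(f"Share of fossil-fuel and cement emissions: {share_of_emissions:.1%}")
```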

Gatti and colleagues’ approach captures 
bi-weekly atmosphere-biosphere gas exchange 
across millions of square kilometres, the first 
time that this has been done at such a scale and
for so long. Their method surpasses the spatial 
and temporal restrictions of, as well as some of 
the assumptions associated with, other meth- 
ods such as plot-level inventories or modelling 
based on satellite data. 

Atmospheric profiling using aircraft is a 
crucial tool in our understanding of Ama- 
zon carbon fluxes, and has the potential — if 
a pan-tropical network of aircraft observa- 
tions can be established — to determine how 
tropical forests worldwide are responding to the
combined threats of increasing drought and 
land-use pressures. A big advantage of this 
method is that it integrates emissions and 
uptake from naturally occurring land and 
river processes with land-use emissions to give 
a regional picture of total carbon fluxes. How- 
ever, understanding the drivers and mecha- 
nisms behind these fluxes is key for the future 
management of carbon in tropical regions. 

Given the importance of fire in shifting 
the Amazon basin from a sink to a source 
of carbon, one of the next steps is to recon- 
cile the different fire types that contribute to 
the authors’ regional estimates of fire emis- 
sions. Gatti and co-workers’ vertical profiling 
detected carbon monoxide, which could have 
been caused by fires used for deforestation 
(Fig. 1), land management (pasture burning 
and ‘slash-and-burn’ agriculture, for example)
and escaped understory wildfires. More
than 85,000 square kilometres of otherwise
intact forests burned in understory fires in
the southern Amazon during the 2000s, and,
in dry years, the area affected can exceed the
area deforested for agriculture and pasture.
These fires kill 8–64% of mature trees across
Amazon forest sites, and burn biomass,
thereby reducing forest carbon stocks.
Teasing out the different land-use drivers that 
contribute to overall fire emissions is essential 
to aid fire-prevention and fire-management 
strategies that could help to reduce those 
emissions. 

Because drought frequency and intensity in 
the Amazon might increase in the future, the
authors’ results are concerning. Furthermore, 
during the period of the study, deforestation 
rates were the lowest they had been since the 
records of Brazil’s National Institute for Space 
Research began in 1988. The substantial fire 
emissions documented by the authors dur- 
ing their study therefore imply that efforts 
to reduce deforestation must also address 
the use of fire as a land-management tool. In 
sum, if drought and fire frequencies increase 
in the future, they may override the Amazon's 
function as a carbon sink.




Figure 1 | Deforestation fire in the southeastern Amazon. Gatti et al. report that a combination of
severe drought and fires associated with land use can shift the Amazon region from being a sink to a
source of atmospheric carbon.
Oiling the wheels
of autoimmunity 


Oily substances in the skin have now been shown to contain structures that 
activate a population of skin-homing, self-reactive T cells. The responses of these
immune cells may contribute to local defences, but also to autoimmune disease. 


MITCHELL KRONENBERG 
& WENDY L. HAVRAN 


Immunology students are taught that
the immune system responds to foreign
entities while remaining tolerant to ‘self’
structures. This is not strictly true, however,
because there are specialized populations of
immune cells that are self-reactive. Such cells
have the potential to initiate undesirable autoimmune
reactions, so their existence raises
several questions. What are the origin and
structure of the self-antigens to which these
cells respond, and how is this potentially dangerous
self-recognition regulated? Reporting
in Nature Immunology, de Jong et al. identify
hydrophobic self-antigens in the skin that are
recognized in an unusual manner by a specialized
subset of skin-resident immune cells.

B and T cells are the white blood cells
responsible for immune recognition in the
adaptive immune system. Populations of self-reactive
T cells reside in or near the epithelial
surfaces of the skin and intestine, where
there are rich concentrations of microorganisms.
Although it may seem paradoxical that
self-reactivity is prevalent where microbes are
abundant, it is possible that self-reactivity at
these surfaces involves ‘sentinel’ immune cells
that can rapidly respond to general signs of cellular
stress or barrier disruption, without the
need for specific recognition of microbes.

The antigens recognized by most T cells
are peptides that are displayed on the surface
of other cells, bound in the groove of
antigen-presenting proteins of the major
histocompatibility complex (MHC) family.
Lipid antigens, by contrast, are bound by the
hydrophobic grooves of CD1 antigen-presenting
proteins, which are related to the MHC
proteins. Humans have four CD1 proteins:
CD1a, CD1b, CD1c and CD1d. Some self
and microbial lipid antigens that bind to CD1
proteins have been identified, but research on 
this antigen-presentation system has mostly 
been restricted to CD1d molecules. 

De Jong et al. concentrated on T cells that 
recognize antigens bound to CD1a, which 
are more prevalent in human blood than 
T cells recognizing other CD1 proteins.
CD1a-reactive T cells are also found in the skin;
when stimulated, these cells produce IL-22 
(ref. 5), a cytokine protein involved in micro- 
bial defence and in inducing the proliferation 
of skin cells called keratinocytes. Moreover, 
Langerhans cells, which are antigen-present- 
ing cells that reside in the skin’s epidermal layer, 
express particularly high amounts of CD1a. 
The authors show that a CD1a molecule 
purified from a human cell line activates CD1a
self-reactive T cells by binding to their anti- 
gen receptor. The antigen-binding grooves 
of MHC and CD1 proteins are always filled, 
but the proteins do not discriminate self from 
non-self — this is the job of the antigen recep- 
tors that react to them. Therefore, the authors 
sought to uncover the CD1a-bound antigens 
that triggered the self-reactive T cells. Using 
mass-spectrometry analysis, they found more 
than 100 molecules corresponding in mass to
lipids (glycolipids and phospholipids) with
different chain lengths and degrees of unsaturation.
They then used several strategies to
detect antigenic activity in this melange,
including tests of synthetic molecules that
had the same molecular masses as the material
eluted from CD1a, and of partially purified
lipids from various cell types.

Figure 1 | Skin-antigen recognition by self-reactive T cells. a, T cells recognize peptide antigens bound
to MHC molecules on the surface of antigen-presenting cells, or lipid antigens that are presented by CD1
proteins. In both cases, the antigen protrudes from the groove of the antigen-presenting molecule to
engage the T-cell receptor. De Jong et al. show that some T cells that react to CD1a molecules (a subset
of the CD1 family) recognize oily substances found in sebum — a hydrophobic layer secreted onto the
outermost layer of the skin. These self-antigens nestle deep within the CD1a groove, such that there might
be direct contact only between CD1a and the T-cell receptor. b, The authors propose that skin-barrier
disruption, through trauma or infection, allows Langerhans cells, which express CD1a, to acquire oily
antigens from sebum and move from the skin’s epidermis to the dermis. There, they make contact with
self-reactive T cells and activate them to produce cell-signalling molecules, such as IL-22, that promote an
immune response to barrier disruption without requiring specific recognition of invading microbes.

The authors found that epidermal lipids 
bound to CD1a were more stimulatory for the 
CD1a-reactive T cells than lipids from other
cell types. More specifically, they showed
that CD1a-reactive T cells recognized highly
hydrophobic compounds such as squalene and
triacylglyceride, which are found naturally in
the skin, when these were bound to CD1a. This
reactivity was selective, and other hydrophobic 
molecules such as cholesterol were not recog- 
nized by the cells. 

Although CD1 molecules have a hydro- 
phobic antigen-binding groove, the com- 
pletely hydrophobic character of the antigens 
presented by CD1a is surprising. T-cell anti- 
gen receptors typically recognize a composite 
structure of the antigen-presenting molecule 
and the antigen, with the exposed portion of 
the antigen participating in engaging the T-cell 
receptor (Fig. 1a). The exposed portion of lipid 
antigens is normally hydrophilic, containing 
a sugar or phosphate group, with the hydrophobic
chains buried in the CD1 groove. But
the CD1a-binding self-reactive T cells did not
obey this rule, because they did not require an 
exposed hydrophilic portion of the stimulatory 
lipid-containing antigen. 

In fact, binding of lipids bearing hydrophilic 
groups to CD1a inhibited the response of these
T cells, presumably by competing with more
strongly hydrophobic antigens for binding into
the CD1a groove. Therefore, it seems that the
self-reactive T-cell antigen receptor requires a
view of CD1a that is unimpeded by exposed
hydrophilic groups; the bound lipid may simply
permit or stabilize CD1a into the correct
conformation. As a consequence, rather than
recognizing single compounds with high 
specificity, these T cells can be stimulated by 
a range of highly hydrophobic substances that 
fit in the CD1a groove. 

The skin forms a barrier to microbes through 
the generation of sebum — a highly hydro- 
phobic substance synthesized in the sebaceous 
glands and secreted onto the outermost layer 
of the skin. Using microdissected sebaceous 
glands, de Jong et al. demonstrated that sebum 
is highly stimulatory for CD1a-dependent 
self-reactive T cells, and that it is rich in anti- 
genic compounds, such as squalene. However, 
sebum is not typically in contact with the 
underlying dermal and epidermal layers that 
contain T cells and Langerhans cells. In nor- 
mal skin, this physical separation may prevent 
CD1a-binding self-reactive T cells from being 
constantly exposed to their antigens. But dis- 
ruption of the skin barrier by injury, infection 
or inflammation might allow sebum contents
to permeate the epidermis and bind to CD1a-expressing
Langerhans cells, thereby stimulating
T-cell responses (Fig. 1b). Although this
may aid general immune defences, in cases of 
prolonged barrier disruption, constant expo- 
sure of immune cells to sebum could contribute 
to autoimmune skin diseases such as psoriasis 
and atopic dermatitis. 

Interestingly, squalene is currently used as 
an immune booster (adjuvant) to enhance 
the efficacy of vaccines and immunotherapies,
and as a carrier for topical delivery to
hair follicles of drugs for treating hair loss.
An autoimmune syndrome has also been
described that is induced by adjuvants, including
squalene. It is possible that activation of
CD1a-binding self-reactive T cells contributes
to this compound's immune-stimulating
effects. Certainly, further investigation of the 
regulation of these T-cell responses to skin oils 
is warranted, both for understanding immu- 
nity and autoimmunity and in light of the 
increasing therapeutic use of such agents.
Quarks are not
ambidextrous 


By separately scattering right- and left-handed electrons off quarks in a
deuterium target, researchers have improved, by about a factor of five, on a
classic result of mirror-symmetry breaking from 35 years ago. SEE LETTER P.67 


WILLIAM J. MARCIANO 


Symmetry makes the world go round.
Scientific theories of the physics of
elementary particles stem from simple
symmetries that dictate the fundamental 
forces governing our Universe. Sometimes 
symmetries are broken, and that can have 
profound implications. An important case is 
the reflection, or right-left mirror, symmetry 
known as parity. On page 67 of this issue, an 
international team at the Thomas Jefferson 
National Accelerator Facility in Newport 
News, Virginia, reports measurements of
parity-symmetry breaking that confirm expec- 
tations and that unambiguously separate the 
electron and (much smaller) quark parity- 
violating interactions. The small quark parity 
violation can be used as a sensitive probe of 
new interactions or to measure subtle nuclear 
effects. 

Elementary particles such as electrons and 
quarks (which make up protons and neutrons) 
carry intrinsic angular momentum called spin 
and act much like spinning tops. By conven- 
tion, particles spinning clockwise with respect 
to their direction of motion are said to be left-handed,
whereas their mirror images — those
spinning anticlockwise — are right-handed.
Parity symmetry swaps left and right, just as 
a mirror does. 

Gravity, electromagnetism and strong 
nuclear forces all respect parity; that is, they 
are symmetrical (unchanged) under left-right 
interchanges. However, in 1956, Tsung-Dao 
Lee and Chen-Ning Yang conjectured that
the weak forces responsible for nuclear decays 
and neutrino interactions might violate parity. 
Subsequent experiments not only confirmed 
that feature, but also found that parity violation 
was maximal: only left-handed particles expe- 
rienced the weak interaction; right-handed 
particles were not affected by the weak forces 
that were known then. Antiparticles, such as 
antielectrons and antiquarks, exhibited the 
opposite preference — only their right-handed 
components participated in weak interactions. 
For the revolutionary idea of parity violation, 
Lee and Yang received the physics Nobel prize 
in 1957. 

Beyond parity violation, small differences 
between the weak interactions of left-handed 
particles and those of right-handed antiparticles,
known as CP violation or matter–antimatter
asymmetry, were subsequently
observed. Today, some as yet undiscovered
form of CP violation is thought to be responsible
for the dominance of matter over antimatter 
throughout the Universe — a feature responsi- 
ble for our very existence. Symmetry violation 
can, indeed, have profound consequences. 

Apart from parity violation, electromagnetic 
and weak interactions are quite similar. Both 
can be viewed as exchanges of packets (quanta) 
of energy called bosons. Electromagnetism is 
mediated by massless photons, whereas heavy, 
charged W bosons mediate weak interactions. 
Although some sort of electroweak unifica- 
tion, jointly describing both interactions, 
seemed natural, parity violation caused
problems. In 1961, it was shown that unification
was possible if, in addition to charged
W bosons, another heavy neutral boson, now 
called the Z boson, also existed. Unfortunately, 
even then, parity violation made it difficult to 
accommodate or relate elementary-particle 
masses. The problem was solved in 1967, 
when it was demonstrated how the introduction
of symmetry breaking through the Higgs
mechanism could be used to provide mass. A 
predicted remnant of that mechanism — the 
Higgs boson — was detected in 2012 at CERN, 
Europe's high-energy physics laboratory near 
Geneva, Switzerland, and François Englert
and Peter Higgs were awarded last year’s Nobel 
Prize in Physics for the theoretical work on the 
Higgs mechanism. 

In the early 1970s, support for the existence 
of the Z boson was observed in neutrino-scattering
experiments. But follow-up studies
proved inconclusive, in that they did not
confirm the parity-violating predictions of
electroweak unification. Then an experiment
called E122, conducted at the SLAC National 
Accelerator Laboratory in Menlo Park, Cali- 
fornia, measured a small parity-violating dif- 
ference between the scattering of right- and 
left-handed electrons on up and down quarks 
in a target of deuterium atoms. The up and 
down quarks are the lightest of the six possible 
types of quark, and make up all nuclei. This 
result unequivocally confirmed the parity- 
violating predictions of electroweak unifica- 
tion. For their work on electroweak unification 
and its implications, Sheldon Lee Glashow,
Abdus Salam and Steven Weinberg received
the physics Nobel prize in 1979. 

During the 35 years since E122 was com- 
pleted, better sources of right- and left-handed 
electrons have been developed, experimental 
techniques have improved and more-intense 
electron beams have become available. Par- 
ity violation has been used for the precise 
measurement of parameters that describe 
the electroweak interaction and to investigate 
nuclear properties. But the parity-violating dif- 
ference measured in the E122 experiment has 
not been improved on — until now. 

In their study, the Jefferson Lab team 
decided to redo the SLAC E122 experiment. 
The researchers worked at lower energy but 
with much higher intensity and polarization
(degree of handedness). As a result, they
improved on some aspects of parity-violating 
differences between the scattering of right- and 
left-handed electrons on up and down quarks 
by about a factor of five. With their higher 
statistics, they were able to untangle the two 
parity-violating effects: the dominant effect due 
to electron parity violation, which had already 
been clearly measured in E122, and a much 
smaller parity-violating effect attributable to 
the quarks in the deuterium nuclei, which was 
beyond the sensitivity of the SLAC experiment. 
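
The quantity at stake in such experiments is a counting asymmetry between the two beam handednesses. The Python sketch below shows the standard definition together with the usual counting-statistics estimate of its uncertainty, which for a small asymmetry is approximately one over the square root of the total number of events. The event counts are hypothetical and are not taken from either experiment; real analyses also correct for beam polarization, backgrounds and systematic effects, none of which is modelled here.

```python
import math

def asymmetry(n_right, n_left):
    """Counting asymmetry A = (N_R - N_L) / (N_R + N_L)."""
    return (n_right - n_left) / (n_right + n_left)

def statistical_error(n_right, n_left):
    """Approximate counting error on A, valid when |A| is much less than 1."""
    return 1.0 / math.sqrt(n_right + n_left)

# Hypothetical event counts, invented for illustration only.
n_right, n_left = 5_000_400_000, 4_999_600_000
a = asymmetry(n_right, n_left)
err = statistical_error(n_right, n_left)
print(f"A = {a:.2e} +/- {err:.2e}")

# Shrinking the counting error fivefold takes 25 times as many events.
err_25x = statistical_error(25 * n_right, 25 * n_left)
print(f"error ratio = {err / err_25x:.1f}")
```

The square-root scaling is one reason why beam intensity matters so much: under this simple model, each factor gained in precision costs the square of that factor in recorded events.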
Why measure such small effects, and so 
precisely? Perhaps, like mountain-climbing 
enthusiasts, physicists study them because they 
are there and represent challenges. However, 
unlike mountains, in the case of parity-violat- 
ing effects sometimes smaller is better. Testing 
the tiny quark parity-violation prediction is a 
nice example: a deviation from expectations 
could signal the presence of a new tiny effect. 
Indeed, the team’s measurement probes some 
types of additional parity-violating effects 
that could be as much as 30 times weaker than 
ordinary weak forces. Precision studies also 
provide access to small nuclear effects that are 
hard to probe in other ways. An example is the 
breaking of charge symmetry (the interchange 
of up and down quarks in deuterium). 
Parity-violating polarized electron scatter- 
ing experiments are expected to continue at 
the Jefferson Lab, using higher-energy elec- 
trons and better particle-detection systems, 
after upgrades to the facility, now in progress, 
are completed. One can anticipate better 
measurements of electroweak parameters,
more-refined nuclear-physics studies and
improved searches for new interactions. 

A great accomplishment can lead to the 
demise of a scientific endeavour. A good exam- 
ple is the race to put a man on the Moon. That 
goal started more than 50 years ago and was a 
spectacular success, but further undertakings 
ended after the mission was accomplished. 
Fortunately, electron-scattering studies of par- 
ity violation did not suffer that fate. Following 
the success of E122 at SLAC, the programme 
changed direction, but improvements in tech- 
nical expertise and accelerator facilities con- 
tinued. The Jefferson Lab has taken leadership 
in polarized-electron scattering initiatives. 
As long as these initiatives address frontier 
questions and interesting goals, they should 
prosper and grow.
Plant diversity rooted
in pathogens 


Ecologists have long pondered how so many species of plant can coexist locally 
in tropical forests. It seems that fungal pathogens have a central role, by 
disadvantaging species where they are locally common. SEE LETTER P.85 


HELENE C. MULLER-LANDAU 


Tropical forests routinely contain more
than 200 tree species in a single hectare
(Fig. 1). Why don’t a few species come
to dominate, by chance or by virtue of being 
better competitors? Multiple hypotheses 
have been proposed to answer this question, 
most of which invoke some sort of niche 
differentiation with respect to resources 
and/or natural enemies. But despite decades 
of research, the issue remains unresolved. In 
this issue, Bagchi et al. (page 85) report the
results of an elegant field study that clearly
implicates natural enemies, specifically fungal
pathogens, as crucial to maintaining tropical- 
plant diversity. 

In 1970, ecologists Daniel Janzen and
Joseph Connell proposed that natural
enemies that target specific host plants 
maintain high tropical-plant diversity by 
elevating the mortality of each plant species 
in areas where it is abundant. Fundamentally, 
the idea is that host-specialized enemies, 
including pathogens and insect herbivores, 
can attack more efficiently and do more dam- 
age where their hosts are more plentiful. As a 
result, each host species fares better when it is 
rare and less well as it becomes more
common — a phenomenon known as 
negative density dependence. Many 
empirical studies have found such 
negative density dependence in tropical
forests, and the Janzen–Connell
hypothesis is the most often cited 
explanation for these patterns and for 
high local diversity of plant species in 
tropical forests. However, niche dif- 
ferences in resource requirements or 
other factors could also cause negative-density-dependent
patterns, and
few studies have explicitly linked such 
patterns to particular natural enemies 
(although see ref. 7 for an exception). 

Bagchi et al. tested this hypothesis 
experimentally by using pesticides 
to remove (or at least reduce) fungal 
pathogens and, separately, insects 
at the seedling-establishment stage. 
Working in a tropical forest in Belize, 
the authors censused seeds falling into 
seed traps and seedlings that became 
established in neighbouring 1-square- 
metre plots that were treated with a 
fungicide or with an insecticide, or 
not treated. In untreated plots, seed- 
ling establishment was negatively den- 
sity dependent and there was a large 
increase in local species diversity from 
the seed to the seedling stage, consistent 
with previous work⁴. Bagchi and 
colleagues’ crucial findings were that fungicide 
application resulted in the near disappearance 
of negative density dependence and a drop in 
seedling species diversity. By contrast, insecti- 
cide application merely weakened negative 
density dependence and led to no change 
in species diversity, although it did increase 
the total number of seedlings and caused a 
dramatic shift in species composition. 

This is the first study to explicitly link a 
particular group of natural enemies to nega- 
tive density dependence and the maintenance 
of species diversity in tropical forest plants. 
It clearly implicates fungal pathogens as the 
most important drivers of these patterns at 
the seedling-establishment stage. In the past, 
there have been more studies of insects than 
of pathogens as agents of the Janzen–Connell 
effect, no doubt owing in large part to the 
greater ease of working with insects. Although 
insect attack has been found to increase with 
host-plant density in several tropical plant 
species⁸, the ability of insects to respond to high 
host density, and thus induce negative density 
dependence, may ultimately be restricted by 
their own enemies, such as parasites or 
predators⁹. Pathogens seem less likely to be similarly 
checked, which may explain their greater 
contribution to negative density dependence. 

Figure 1 | Shades of green. The forest canopy on Barro Colorado 
Island, Panama, provides visual evidence of how small areas can 
contain many different tropical tree species. Bagchi and colleagues’ 
findings¹ suggest that fungal pathogens play a crucial part in 
maintaining this diversity. 

Bagchi and colleagues’ results demonstrate 
that fungal pathogens and insect herbivores 
influence tropical plant communities in 
qualitatively different ways. Their distinct roles 
relate to the two ways in which differences 
among plant responses to natural enemies can 
affect species diversity and composition. First, 
as discussed above, differences in natural ene- 
mies can contribute to niche differences that 
stabilize individual species’ abundances and 
species diversity. Alternatively, or in addition, 
they can alter differences in competitive ability 
(fitness) among species¹⁰, thereby modifying 
species abundances, and potentially which 
species can successfully compete at all. The 
large shifts in species composition seen dur- 
ing insecticide treatment suggest that insects 
have major impacts on fitness differences in 
this ecosystem. Overall, it seems that fungal 
pathogens are more important determinants 
of niche differences and thus species diversity, 
whereas insects have greater influence on fit- 
ness differences and thus species composition. 

This work also contributes novel obser- 
vations on the strength of negative density 
dependence in different species. Contrasting 
hypotheses predict either a negative relation- 
ship between a species’ average abundance 
and its negative density dependence if greater 
abundance makes a species more apparent 
to its enemies, or a positive one if the cau- 
sality is reversed and lower negative density 
dependence leads to increased abundance’. 
Previous studies*”’ that quantified tree abun- 
dance over large areas have found that more- 
abundant species experience less negative 
density dependence. Bagchi et al. find that 
species that are more abundant as seeds 
suffer stronger negative density 
dependence, which at first seems to contradict 
these earlier findings. However, seed 
abundance depends not only on tree 
abundance but also on seed size, which 
varies widely among tropical tree 
species. Small-seeded species are likely to 
be particularly vulnerable to natural 
enemies, and small seeds are produced 
in greater numbers, so differences 
in seed size may reconcile Bagchi and 
colleagues’ results with previous work. 
Future studies should seek to disentan- 
gle the roles of species traits and abun- 
dances in driving interspecific variation 
in negative density dependence. 

Indeed, this groundbreaking experi- 
mental work lays the foundation for a 
host of studies exploring the roles of 
natural enemies in structuring tropi- 
cal-plant diversity. Bagchi et al. investi- 
gated effects on seedling establishment, 
a single life stage — integration of such 
effects over the entire life cycle will ulti- 
mately provide a more complete picture. 
Replication of these experiments across 
climatic gradients could also test the 
idea that some climates are more con- 
ducive to natural enemies, and that this 
contributes to greater species diversity 
of forests in these areas. Furthermore, 
such comparative studies or others 
that explicitly manipulate temperature, rain- 
fall or atmospheric carbon dioxide could 
address how global change affects interactions 
between plants and natural enemies and 
thereby illuminate the future of tropical plant 
diversity. 


Although it is generally agreed that the Arctic flora is among the youngest and least diverse on Earth, the processes that 
shaped it are poorly understood. Here we present 50 thousand years (kyr) of Arctic vegetation history, derived from the 
first large-scale ancient DNA metabarcoding study of circumpolar plant diversity. For this interval we also explore 
nematode diversity as a proxy for modelling vegetation cover and soil quality, and diets of herbivorous megafaunal 
mammals, many of which became extinct around 10 kyr BP (before present). For much of the period investigated, Arctic 
vegetation consisted of dry steppe-tundra dominated by forbs (non-graminoid herbaceous vascular plants). During the 
Last Glacial Maximum (25–15 kyr BP), diversity declined markedly, although forbs remained dominant. Much changed 
after 10 kyr BP, with the appearance of moist tundra dominated by woody plants and graminoids. Our analyses indicate 
that both graminoids and forbs would have featured in megafaunal diets. As such, our findings question the predom- 
inance of a Late Quaternary graminoid-dominated Arctic mammoth steppe. 


It can be argued that Arctic vegetation during the proximal Quater- 
nary (the last circa 50 kyr) is less well understood than the ecology and 
population dynamics of the mammals that consumed it, despite the 
overall uniformity and low floristic diversity of Arctic vegetation’”. Ana- 
lyses of vegetation changes during this interval have been based mainly 
on fossil pollen. Although highly informative, records tend to be biased 
towards high pollen producers such as many graminoids (grasses, sedges 
and rushes) and Artemisia, which can obscure the abundance of other 
forms such as many insect-pollinated forbs’. Arctic pollen records are 
rarely comprehensively identified to species level, which underestimates 
actual diversity’. These problems are to some extent ameliorated by plant 
macrofossil studies (for example, ref. 4), which may provide detailed 
records of local vegetation. However, macrofossil studies are far less 
common, have their own taxonomic constraints, and usually cannot 
provide quantitative estimates of abundance. 

In recent years, a complementary approach has emerged that uses 
plant and animal ancient DNA preserved in permafrost sediments’. 
Such environmental DNA® does not derive primarily from pollen, 
bones or teeth, but likely from above- and below-ground plant bio- 
mass, faeces, discarded cells and urine preserved in sediments’. Like 
macrofossils, environmental DNA appears to be local in origin®’”"'” 
and, in principle, the survival of a few fragmented DNA molecules is 
sufficient for retrieval and taxonomic identification”. 

Environmental DNA can supply the fraction of the plant community 
not readily identifiable by pollen analysis and, to some extent, 
macrofossils, particularly in vegetation dominated by non-woody growth forms’. 
For most plant groups, DNA permits identification at lower 
taxonomic levels than pollen’. In addition, environmental DNA records 
have proven to reflect not only the qualitative but also the quantitative 
diversity of above-ground plant’* and animal taxa’, as determined 
from modern subsurface soils. 

Leaching of DNA through successive stratigraphic zones may be an 
issue in temperate conditions””’ but not in permafrost® or in sediments 
that have only recently thawed"*. Re-deposition of sediments and organics 
can confound results, which is also the case for pollen and macrofossils”"*, 
but can be avoided and accounted for by careful site selection and by 
excluding rare DNA sequence reads'*. For Quaternary permafrost 
settings, at least, taphonomic bias due to differences in DNA survival 
across plant groups does not appear to be of concern (see Methods 
section 4.0 on taphonomy), as has been shown by a comparative per- 
mafrost ancient DNA study of plants and their associated fungi’. 


Reconstruction of Arctic vegetation from permafrost 

We collected 242 sediment samples from 21 sites across the Arctic 
(Fig. 1 and Extended Data Table 1). Ages were determined by 
accelerator mass spectrometry radiocarbon (¹⁴C) dating, and are reported 
here in thousands of calibrated (calendar) years BP (Extended Data 
Fig. 1 and Supplementary Data 1). We sequenced the short P6 loop 
sequence of the trnL plastid (gene encoding chloroplast transfer RNA 
for leucine) region and a part of the ITS1 spacer region through meta- 
barcoding (Methods section 3.0), generating a total of 14,601,839 trnL 
plant DNA sequence reads and 1,652,857 internal transcribed spacer 
(ITS) reads. Reads were identified by comparison with (1) the Arctic 
trnL taxonomic reference library, which we extended with ITS sequences 
for three families; (2) a new north boreal trnL taxonomic reference library 
constructed by sequencing 1,332 modern plant samples representing 
835 species; and (3) GenBank, using the program ecoTag (Supplementary 
Data 2 and Methods section 3.0). Basic statistics, in silico analyses, 
and additional experiments were carried out to check data reliability 
(Extended Data Fig. 2 and Extended Data Table 2). We grouped the 
identified molecular operational taxonomic units (MOTUs) into three 
distinct intervals (Fig. 2a): (1) pre-Last Glacial Maximum (LGM) 
(50–25 kyr BP), a period of fluctuating climate; (2) LGM (25–15 kyr BP), a 
period of constantly cold and dry conditions; and (3) post-LGM 
(15–0 kyr BP), which includes the current interglacial, characterized by 
relatively higher temperatures’”. 

Figure 1 | Sample localities. A total of 242 permafrost samples were collected 
from 21 sites, shown by green dots (1–21). Eight ancient megafauna gut and 
coprolite samples (A–H) are shown by grey hollow circles, and seven modern 
nematode localities are shown by grey hollow triangles (a–g). (1) Anadyr, 
(2) Baskura Peninsula, (3) Bol’shaya Balakhnaya, (4) Buor Khaya, (5) Cape 
Sabler, (6) Colesdalen, (7) Duvanny Yar, (8) Endalen, (9) Federov Island, 
(10) Goldbottom, (11) Khatanga, (12) Main River, (13) Ovrazhny Peninsula, 
(14) Purgatory, (15) Quartz Creek, (16) Ross Mine, (17) Stevens Village, 
(18) Stuphallet, (19) Taimyr Lake, (20) Upper Taymyr River, (21) Zagoskin 
Lake, (A) Drevniy Creek Mammoth, (B) Bison, (C) Lyuba Mammoth, 
(D) Kolyma Rhino, (E) Last Chance Creek Horse, (F) Churapcha Rhino, 
(G) Mongochen Mammoth, (H) Finish Creek Valley Mammoth, (a) Blackstone 
River, (b) Ogilvie Mountains, (c) Eagle Plains South, (d) Eagle Plains North, 
(e) Little Atlin Lake, (f) Kluane Lake, (g) Carmacks. 


Shifts in plant community composition 


To address compositional changes in vegetation across space and time 
we used a generalized linear model and permutational multivariate ana- 
lysis of variance (PERMANOVA) (Supplementary Data 3 and Methods 
section 6.0). We find that (1) the composition of plant MOTU 
assemblages differed significantly across the three intervals (pseudo-F = 6.77, 
P < 0.001, Extended Data Fig. 3a–e), with pre-LGM and post-LGM 
plant assemblages differing the most (Extended Data Fig. 3f); (2) the 
greater the spatial distance separating a pair of samples within each time 
period, the less similar their composition (P < 0.001); and (3) LGM 
assemblages were the most homogeneous across space and post-LGM 
assemblages were the most heterogeneous (Fig. 2). 

LGM pollen spectra show high floristic richness compared to other 
intervals (for example, ref. 1). This is due to the limited occurrence of 
woody taxa with high pollen production, which in turn proportion- 
ately emphasizes less-productive taxa. By contrast, our DNA data reveal 
that plant diversity was lowest during LGM relative to other intervals 
(Fig. 2a). Plant assemblages became more similar to each other and 
the estimated number of MOTUs decreased from pre-LGM to LGM 
(Fig. 2a), with many taxa absent that had previously been well represented 
(Fig. 2b). In addition, although the LGM flora was largely a subset of 
the pre-LGM flora, the post-LGM flora was different (Fig. 2b), with 
pronounced geographic differentiation (Fig. 2c). 

Figure 2 | Taxonomic diversity of Arctic plant assemblages during the last 
50 kyr. Taxon composition was estimated by high-throughput sequencing of 
DNA from 242 permafrost samples. A total of 154 MOTUs were detected. 
a, Index of ambient temperature (continuous line; oxygen isotope 
concentration, North Greenland Ice Core Project, NGRIP*’) and estimated 
MOTU number (horizontal bars; second-order jackknife) are shown for three 
palaeoclimatic periods: pre-LGM (>25 kyr BP, n = 149), LGM (25–15 kyr BP, 
n = 32) and post-LGM (<15 kyr BP, n = 61). b, MOTU counts recorded 
uniquely in each palaeoclimatic period and shared among periods. c, Modelled 
decline in similarity (1 − Bray-Curtis (BC) dissimilarity) between pairs of plant 
assemblages from the same palaeoclimatic period in relation to the spatial 
distance separating them. 
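
Figure 2a reports MOTU numbers estimated with a second-order jackknife. As a rough guide to what such an estimator does, the sketch below implements the standard incidence-based second-order jackknife, which adds to the observed richness a correction based on taxa detected in exactly one or exactly two samples. Whether the authors used this exact variant is an assumption, and the taxon labels are hypothetical.

```python
from collections import Counter


def jackknife2_richness(samples):
    """Second-order jackknife richness estimate from presence/absence data.

    samples: list of sets, each holding the taxa detected in one sample.
    """
    m = len(samples)
    if m < 2:
        raise ValueError("need at least two samples")
    # Number of samples in which each taxon was detected.
    incidence = Counter(taxon for sample in samples for taxon in set(sample))
    s_obs = len(incidence)                               # observed richness
    q1 = sum(1 for n in incidence.values() if n == 1)    # taxa in one sample only
    q2 = sum(1 for n in incidence.values() if n == 2)    # taxa in exactly two samples
    return s_obs + q1 * (2 * m - 3) / m - q2 * (m - 2) ** 2 / (m * (m - 1))


# Hypothetical detections in four samples (labels A-E are placeholders).
example = [{"A", "B", "C"}, {"A", "B"}, {"A", "D"}, {"A", "B", "E"}]
estimate = jackknife2_richness(example)
print(f"observed: 5 taxa, jackknife estimate: {estimate:.2f}")
```

Because three of the five taxa occur in a single sample, the estimate exceeds the observed count, reflecting taxa that were probably present but missed.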


Steppe-tundra 

Owing to the low taxonomic resolution of previously published vege- 
tation reconstructions, it remains undetermined whether Arctic vege- 
tation during the last part of the Quaternary was a form of tundra or 
more like steppe (for example, refs 18, 19). Small-scale contemporary 
analogues range from low-productivity fellfields and cryoxeric steppe 
communities to more productive dry Arctic steppe-to-tundra gradients. 
Our sediment DNA plant sequence data from ~50-12 kyr BP encom- 
pass taxa that typify both tundra and Arctic steppe environments. These 
include taxa that are today typical of dry and/or disturbed sites (for 
example, Bromus pumpellianus, Artemisia frigida, Plantago canescens, 
Anemone patens), saline soils (Puccinellia, Armeria), moist habitats 
(Caltha) and rocky or fellfield habitats (Dryas, Draba), plus a woody 
component dominated by Salix (Supplementary Data 4 and 5). A spa- 
tial and/or temporal mosaic of plant communities is indicated (Methods 
section 6.0), as is seen in floristically rich macrofossil records*. The most 
common MOTU in the pre-LGM and LGM samples is Anthemideae 
group 1 (Artemisia, Achillea, Chrysanthemum, Tanacetum), which under- 
scores the importance in regional pollen assemblages of Asteraceae in 
general and Artemisia in particular’. Equisetum and Eriophorum are 
important only in postglacial assemblages, reflecting moister soil con- 
ditions. Increases in aquatic taxa (Supplementary Data 4 and 5) also 
indicate a predominance of moister substrates in the later part of the 
post-LGM period. These findings indicate a shift from dry steppe- 
tundra to moist tundra in the early part of the post-LGM period—a 
change widely reported in other proxy studies. 

Nematode assemblage composition is known to change with vege- 
tation cover”, moisture’’ and organic resource inputs”’. Therefore, to 
obtain a complementary proxy for vegetation cover and soil quality, 
we characterized the soil nematode fauna of contemporary mesic shrub 
tundra and subarctic steppe on well-drained loess soils in Yukon Terri- 
tory, Canada (Fig. 1 and Extended Data Table 3). The relative proportion 
of the nematode families Teratocephalidae and Cephalobidae varied 
among vegetation types (P < 0.001, nested ANOVA), and indicator spe- 
cies analysis*’ confirmed that Teratocephalidae (indicator value = 0.98, 
P = 0.001) and Cephalobidae (indicator value = 0.98, P = 0.001) are 
very good indicators of tundra and steppe vegetation, respectively (Fig. 3). 
These findings are in agreement with previous studies restricted to 
subarctic Sweden™*” and alpine and subalpine habitats**’’. We amplified 
short DNA sequences from these two taxa from 17 sediment samples 
analysed for plant DNA from Yukon and northeastern Siberia. We 
detected Cephalobidae DNA in almost all samples, whereas 
Teratocephalidae was detected at a higher frequency in samples younger than 
10 kyr BP than in the pre-LGM and LGM samples (Extended Data 
Table 4). These results support our inferences from plant sequence 
data and indicate a transition from relatively dry tundra and steppe 
towards more moist tundra during the post-LGM interval. 

Figure 3 | Proportional abundance of two families, Teratocephalidae and 
Cephalobidae, among the total soil nematode community at contemporary 
tundra and steppe sites in Yukon, Canada. Teratocephalidae, dark; 
Cephalobidae, light. Letters a–g correspond to sample localities (Fig. 1). Median 
(central dot), quartile (box), maximum and minimum (whiskers) and outlying 
values (points) are shown. 
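
Indicator species analysis scores how well a taxon marks one habitat type. One widely used formulation, the Dufrêne-Legendre indicator value, multiplies specificity (how concentrated the taxon's abundance is in the target habitat) by fidelity (the fraction of target sites where it occurs). The sketch below assumes that formulation and uses hypothetical proportions; it does not reproduce the permutation test behind the reported P values.

```python
def indicator_value(abundance_by_group, target):
    """Indicator value of one taxon for the `target` group of sites.

    abundance_by_group: dict mapping group name -> list of abundances,
    one entry per site in that group.
    """
    means = {g: sum(v) / len(v) for g, v in abundance_by_group.items()}
    total = sum(means.values())
    if total == 0:
        return 0.0
    specificity = means[target] / total          # share of mean abundance in target
    sites = abundance_by_group[target]
    fidelity = sum(1 for x in sites if x > 0) / len(sites)  # occupancy in target
    return specificity * fidelity


# Hypothetical proportional abundances of one nematode family per site.
family = {
    "tundra": [0.30, 0.25, 0.35],
    "steppe": [0.00, 0.01, 0.00],
}

print(f"tundra indicator value: {indicator_value(family, 'tundra'):.2f}")
print(f"steppe indicator value: {indicator_value(family, 'steppe'):.2f}")
```

A value near 1 means the family is both largely confined to that habitat and present at nearly every site within it.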


Forb dominance and megafaunal diets 


To assess structural and functional shifts in the plant assemblages, we 
investigated temporal changes in the relative abundance of different 
growth forms. Our DNA results show that pre-LGM vegetation was 
dominated by forbs, the relative share of which increased during the 
LGM, whereas graminoids constituted less than 20% of the total read 
count (Fig. 4a). These results persisted when we corrected for observed 
modern representational bias'* (Methods sections 4.0 and 5.3). 

Continued forb dominance during the LGM implies that similar 
proportions of forbs and graminoids were maintained through this 
period, despite the significant decline in floristic diversity (Fig. 2a, b). 
Our findings contrast with pollen-based reconstructions, which have 
emphasized dominance of graminoids in the unglaciated Arctic and 
adjacent regions, particularly during the LGM, and are exemplified by 
the widely used term mammoth steppe”. Rather, our results show that 
vegetation was forb-dominated in both overall abundance of MOTUs 
and in floristic richness (Fig. 4a, b and Extended Data Fig. 3g, h), in 
agreement with macrofossil data that show a diversity of forbs of mixed 
ecological preference (for example ref. 4). 

We explored whether forbs were prominent in habitats favoured by 
megafauna by analysing 25 dated (47–20 kyr BP) sediment samples 
from Main River, Siberia, using trnL plastid plant and 16S mitochon- 
drial DNA mammal primers. We found that the mean proportion of 
forbs was higher in samples from which herbivorous megafaunal DNA 
had been retrieved (n = 18; for example, woolly mammoth, woolly rhi- 
noceros, horse, reindeer and elk) than in samples lacking such DNA 
(n = 7; Fig. 4c and Extended Data Table 5). Although suggestive of co- 
occurrence of megafauna in forb-dominated settings, these results should 
be regarded as tentative, and further studies are needed to verify if this 
is indeed a general trend. 

We also investigated whether megafaunal diets revealed the level 
of forb dominance observed in permafrost sediment samples. Using 
standardized methods, we genetically characterized intestinal/stomach 
contents and coprolites recovered from eight specimens of woolly mam- 
moth, woolly rhinoceros, bison and horse from Siberia and Alaska, 
dated >55–21 kyr BP (Extended Data Table 6 and Methods sections 
3.0 and 7.3). Although ingested plant remains are often difficult to iden- 
tify morphologically, they can be accurately identified” and roughly 
quantified’? using DNA. The majority of these samples are dominated 
by forbs, which comprise 0.63 ± 0.12 of the sequences, compared to 
0.27 ± 0.16 for graminoid sequences (Fig. 4d and Supplementary 
Data 6). These results suggest that megafaunal species supplemented 
their diets with high-protein forbs rather than specializing more or less 
exclusively on grasses. 

To confirm the reliability of our trnL approach for estimating herbi- 
vore diet, we analysed 50 rumen samples of sheep-feed diets with varying 
proportions of forbs (white clover (Trifolium repens)) and graminoids 
(ryegrass (Lolium perenne)) (Methods section 5.4). As seen in Fig. 4e, 
the Pearson correlation coefficient between the actual fraction of forbs 
in these diets and the proportion of forbs estimated with the 
DNA-based approach was highly significant (r² = 0.75, P< 107'”). 
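
The validation above rests on a Pearson correlation between known and DNA-estimated forb fractions. A minimal sketch of that calculation follows; the five data pairs are invented for illustration and are not the sheep rumen measurements, so the resulting coefficient is not comparable with the value reported in the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: true forb fraction in the diet and the fraction
# of forb reads recovered by sequencing.
actual_forb = [0.00, 0.25, 0.50, 0.75, 1.00]
estimated_forb = [0.05, 0.20, 0.55, 0.70, 0.90]

r = pearson_r(actual_forb, estimated_forb)
print(f"r = {r:.3f}, r squared = {r * r:.3f}")
```

The squared coefficient is the share of variance in the estimated fraction that a linear relationship with the true fraction accounts for.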


Discussion 


Our observations of high forb abundance in the Terminal Pleistocene 
may merely reflect vegetation response to glacial climates, but there are 
other possibilities’. An abundant megafauna would have caused 
significant trampling”, enhancing gap-based recruitment”, which could favour 
forbs*’. Coupled with nitrogen input from wide-ranging herbivores™, 
forbs may out-compete grasses**. Furthermore, a diet rich in forbs may 
help to explain how numerous large animals were sustained; forbs may 
be more nutrient-rich (for example, ref. 35) and more easily digested”® 
than grasses. However, a feedback loop that maintained nutritious and 
productive forage and supported large mammalian populations in 
glacial climate regimes may have been impossible to maintain after 
deglaciation, as C:N ratios increased with global warming”, and the potential 
breakdown of the megafauna-forb interaction would have been 
exacerbated by declining mammalian populations. In contemporary tundra 
and steppe (the latter often called grasslands), graminoids are generally 
perceived to be the dominant growth form in large herbivore habitats 
(for example, refs 38, 39). Our data, which unearth 50 kyr of Arctic 
vegetation history, call this perception into question. 

Figure 4 | Plant growth form composition over 
time and across sample types, estimated by 
high-throughput sequencing of DNA from 242 
permafrost samples. a, Proportions of DNA reads 
corresponding to taxa exhibiting different growth 
forms, binned over 5 kyr time intervals. The 
analysis included all sediment samples except 21 
Svalbard samples and three further samples for 
which no growth form information was available. 
b, Number of MOTUs exhibiting different growth 
forms as a proportion of total MOTU richness in all 
informative samples for each palaeoclimatic 
period. c, The proportional abundance of forbs in 
samples from Main River, Siberia (dated 
47,100–19,850 yr BP) where megafauna were or were not 
detected. d, Proportions of DNA reads 
corresponding to different growth forms in 
megafauna diet, determined from analysis of eight 
gut and coprolite samples from late Quaternary 
megafauna species (woolly mammoth, woolly 
rhinoceros, bison and horse). Letters A–H 
correspond to the individual samples (Fig. 1). The 
95.4% calibrated age range of each sample is shown; 
‘>55’ indicates that the sample was too old to 
provide a finite radiocarbon age. e, Reliability of the 
trnL approach for estimating forb and graminoid 
abundance in diet analyses. Sheep were fed with 
known amounts of forbs (Trifolium repens) and 
graminoids (Lolium perenne), and the rumen 
content analysed using the same DNA-based 
approach as implemented above. Grey dots are raw 
data points, orange dots and lines represent the 
means ± standard errors for diets containing 
different fractions of forbs. The grey line is a linear 
model fit. Numbers immediately below the 
columns in a, b and c indicate sample sizes. Median 
(central dot), quartile (box), maximum and 
minimum (whiskers) values are shown in a and c. 


METHODS SUMMARY 


Plant fragments or soil matrix organics were ¹⁴C-dated using accelerator mass 
spectrometry and measured ages were converted into calendar years*’. Permafrost 
sampling, DNA extraction, PCR amplification and taxon identification (for example, 
ref. 41) followed established procedures. Most vascular taxa are covered by ref. 42, 
and nomenclature is provided accordingly; for the remaining taxa nomenclature 
follows ref. 43. Dissimilarity between plant assemblages was quantified using pairwise 
Bray-Curtis distance“’. Variation in assemblage dissimilarity was decomposed using 
PERMANOVA* and visualized using non-metric multidimensional scaling”. 
We used a distance decay approach** and a generalized linear model to model vari- 
ation in plant community assemblages over space and time. Growth form compo- 
sition of communities was compiled from species trait databases’. Differences in 
the trait composition of assemblages in adjacent climatic periods were compared to 
a null model assuming random assortment from the previous interval. Nematode 
faunas of 35 contemporary sediment samples were morphologically determined. 
Presence of two indicator families (Teratocephalidae for tundra and Cephalobidae 
for steppe) was genetically determined in 17 ancient sediment samples. Mega- 
faunal DNA and faeces and gut content were determined genetically following 
established methods. For a detailed discussion, see Methods. 
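
For readers unfamiliar with the Bray-Curtis measure named above, the following sketch computes it for a pair of assemblages, along with the similarity (1 − dissimilarity) plotted in Fig. 2c. The taxa and counts are hypothetical, and the sketch makes no claim about how the authors transformed or normalised read counts before computing distances.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two assemblages.

    a, b: dicts mapping taxon name -> non-negative abundance.
    Returns 0 for identical assemblages and 1 when no taxa are shared.
    """
    taxa = set(a) | set(b)
    difference = sum(abs(a.get(t, 0) - b.get(t, 0)) for t in taxa)
    total = sum(a.get(t, 0) + b.get(t, 0) for t in taxa)
    if total == 0:
        raise ValueError("both assemblages are empty")
    return difference / total


def similarity(a, b):
    """Similarity expressed as 1 minus the Bray-Curtis dissimilarity."""
    return 1.0 - bray_curtis(a, b)


# Hypothetical read counts per taxon in two sediment samples.
sample_1 = {"Salix": 10, "Artemisia": 30}
sample_2 = {"Salix": 20, "Artemisia": 10, "Equisetum": 10}

print(f"dissimilarity: {bray_curtis(sample_1, sample_2):.2f}")
print(f"similarity:    {similarity(sample_1, sample_2):.2f}")
```

A full analysis would compute this for every pair of samples and pass the resulting matrix to PERMANOVA or an ordination routine.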


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 


In vivo discovery of immunotherapy 
targets in the tumour microenvironment 


Penghui Zhou, Donald R. Shaffer, Diana A. Alvarez Arias, Yukoh Nakazaki, Wouter Pos, Alexis J. Torres, Viviana Cremasco, 
Stephanie K. Dougan, Glenn S. Cowley, Kutlu Elpek, Jennifer Brogdon, John Lamb, Shannon J. Turley, Hidde L. Ploegh, 
David E. Root, J. Christopher Love, Glenn Dranoff, Nir Hacohen, Harvey Cantor & Kai W. Wucherpfennig 


Recent clinical trials showed that targeting of inhibitory receptors on T cells induces durable responses in a subset of 
cancer patients, despite advanced disease. However, the regulatory switches controlling T-cell function in immuno- 
suppressive tumours are not well understood. Here we show that such inhibitory mechanisms can be systematically dis- 
covered in the tumour microenvironment. We devised an in vivo pooled short hairpin RNA (shRNA) screen in which 
shRNAs targeting negative regulators became highly enriched in murine tumours by releasing a block on T-cell proliferation 
upon tumour antigen recognition. Such shRNAs were identified by deep sequencing of the shRNA cassette from T cells 
infiltrating tumour or control tissues. One of the target genes was Ppp2r2d, a regulatory subunit of the PP2A phosphatase 
family. In tumours, Ppp2r2d knockdown inhibited T-cell apoptosis and enhanced T-cell proliferation as well as cytokine 
production. Key regulators of immune function can therefore be discovered in relevant tissue microenvironments. 


Recent work has shown that cytotoxic T cells have a central role in 
immune-mediated control of cancer’. T cells are able to specifically 
detect and eliminate cancer cells following T-cell receptor (TCR)-mediated 
recognition of tumour-derived peptides bound to MHC proteins®. A 
series of studies have convincingly demonstrated that the extent of 
tumour infiltration by cytotoxic T cells is a critical factor determining 
the natural progression of diverse types of cancers'*?""'. A landmark 
study showed that the type, density and location of cytotoxic T cells 
within tumours enabled better prediction of patient survival than histo- 
pathological methods used for staging of cancers’. Strong infiltration of 
both the tumour centre and the invasive tumour margin by cytotoxic 
T cells (which express the CD8 surface marker) was shown to correlate 
with a favourable prognosis, regardless of the local extent of tumour 
invasion and spread to local lymph nodes. Conversely, weak in situ expan- 
sion of CD8 T cells correlated with a poor prognosis even in patients with 
minimal tumour invasion’. However, in the majority of patients this natural 
defence mechanism is severely blunted by immunosuppressive cell popu- 
lations recruited to the tumour microenvironment, including regula- 
tory T cells, immature myeloid cell populations and tumour-associated 
macrophages*’*"“*. Highly complex interactions among a variety of differ- 
ent cell types in the tumour microenvironment—including tumour cells, 
immune cells and stromal cells—therefore contribute to clinical outcome. 

The critical role of T cells in immune-mediated control of cancers is 
further underscored by therapeutic benefit following administration 
of monoclonal antibodies targeting inhibitory receptors on T cells, CTLA-4 
and PD-1'*""*. Clinical benefit is enhanced by co-administration of 
antibodies targeting CTLA-4 and PD-1"*”*. Particularly notable is the 
finding that such antibodies can induce durable responses in a subset 
of patients with advanced disease. However, many of the regulatory 
pathways in T cells that result in loss of function within immuno- 
suppressive tumour microenvironments remain unknown. 

Immune cells perform complex surveillance functions throughout 
the body and interact with many different types of cells in distinct tissue 
microenvironments. Therapeutic targets for modulating immune 
responses are typically identified in vitro and tested in animal models at a 
late stage of the process. We postulated that the complex interactions of 
immune cells within tissues, many of which do not occur in vitro, offer 
untapped opportunities for therapeutic intervention. Here we have 
addressed the challenge of how targets for immune modulation can 
be systematically discovered in vivo. 


Design of in vivo discovery approach 

Pooled shRNA libraries have been shown to be powerful discovery 
tools****. We reasoned that shRNAs capable of restoring CD8 T-cell 
function can be systematically discovered in vivo by taking advantage 
of the extensive proliferative capacity of T cells following triggering of 
the TCR by a tumour-associated antigen. When introduced into T cells, 
only a small subset of shRNAs from a pool will restore T-cell prolifera- 
tion, resulting in their enrichment within tumours. Over-representation 
of active shRNAs within a pool can be quantified by deep sequencing of 
the shRNA cassette from tumours and secondary lymphoid organs 
(Fig. 1a). 
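
The quantification step can be pictured as comparing each shRNA's share of sequencing reads in the tumour with its share in a reference tissue. The sketch below is one plausible way to express that as a log2 enrichment; the statistic actually used in the screen is not specified in this passage, and the shRNA identifiers, read counts and `pseudocount` parameter are hypothetical.

```python
import math


def log2_enrichment(tissue_counts, reference_counts, pseudocount=1.0):
    """log2 ratio of each shRNA's read fraction in a tissue versus a reference.

    Counts are converted to fractions of total reads so that samples
    sequenced to different depths are comparable. The pseudocount avoids
    division by zero for shRNAs with no reads in one of the samples.
    """
    ids = set(tissue_counts) | set(reference_counts)
    n = len(ids)
    tissue_total = sum(tissue_counts.values()) + pseudocount * n
    reference_total = sum(reference_counts.values()) + pseudocount * n
    result = {}
    for sh in ids:
        t_frac = (tissue_counts.get(sh, 0) + pseudocount) / tissue_total
        r_frac = (reference_counts.get(sh, 0) + pseudocount) / reference_total
        result[sh] = math.log2(t_frac / r_frac)
    return result


# Hypothetical read counts for three shRNAs (identifiers are made up).
spleen = {"shRNA_A": 1000, "shRNA_B": 1000, "shRNA_C": 1000}
tumour = {"shRNA_A": 8000, "shRNA_B": 1000, "shRNA_C": 1000}

enrichment = log2_enrichment(tumour, spleen)
for sh, value in sorted(enrichment.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{sh}: {value:+.2f}")
```

Because fractions must sum to one, strong enrichment of one shRNA pushes the others below zero, which is one reason pooled screens are usually interpreted across several shRNAs per gene.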

We chose to study B16 melanoma, an aggressive tumour that is 
difficult to treat**. Melanoma cells expressed the surrogate tumour anti- 
gen ovalbumin (Ova), which is recognized by CD8 T cells from OT-I 
T-cell receptor transgenic mice”*’*. Initial experiments showed that 
such a screen could also be performed with pmel-1 T cells that recog- 
nize gp100, an endogenous melanoma antigen”, but the signal/noise 
ratio was lower for pmel-1 T cells owing to smaller T-cell populations 
in tumours. Naive T cells are difficult to infect with lentiviral vectors, 
and we therefore pretreated T cells for two days with the homeostatic 
cytokines IL-7 and IL-15 before spin infection with shRNA pools in a 
lentiviral vector. Successful transduction was monitored by surface 
expression of the Thy1.1 reporter (Extended Data Fig. 1a). T cells were 
injected into B6 mice bearing day 14 B16-Ova tumours. Seven days 
later, T cells were purified from tumours and secondary lymphoid 
Figure 1 | In vivo RNAi discovery of immunotherapy targets. a, In vivo 
discovery approach for negative regulators of T-cell function in tumours. T cells 
infected with shRNA libraries were injected into tumour-bearing mice; 
shRNAs that enabled T-cell accumulation in tumours were identified by deep 
sequencing of the shRNA cassette from purified T cells. b, Deep sequencing 


organs (spleen, tumour-draining and irrelevant lymph nodes) for isola- 
tion of genomic DNA, followed by PCR amplification of the shRNA 
cassette (Extended Data Fig. 1b). The representation of shRNAs was 
then quantified in different tissues by Illumina sequencing. 


In vivo shRNA pool screens 

Two large screens were performed, with the first focusing on genes 
overexpressed in dysfunctional T cells (T-cell anergy or exhaustion; 
255 genes, 1,275 shRNAs divided into two pools), and the second on 
kinases/phosphatases (1,307 genes, 6,535 shRNAs divided into seven 
pools) (Table 1a). In these primary screens, each gene was represented 
by approximately five shRNAs (it is common that only one or two of 
such shRNAs have sufficient activity in pooled screens). We observed 
multiple distinct in vivo phenotypes. For certain genes, shRNAs were 
over-represented in all tested tissues compared to the starting T-cell 
population (for example, SHP-1), indicative of enhanced proliferation 
independent of TCR recognition of a tumour antigen. For other genes, 
there was a selective loss of shRNAs within tumours (for example, ZAP-70, 
a critical kinase in the T-cell activation pathway). We focused our anal- 
ysis on genes whose shRNAs showed substantial over-representation 
in tumour but not spleen, a secondary lymphoid organ. Substantial 
T-cell accumulation in tumours was observed for a number of shRNAs, 
despite the immunosuppressive environment. For secondary screens, 
we created focused pools in which each candidate gene was represented 


Table 1 | Summary of primary and secondary shRNA screens 
data: shRNA sequence reads from tumours, irrelevant (irLN) and draining 
lymph nodes (dLN) versus spleen. Upper row, sequence reads for all genes in a 
pool; lower row, individual genes (LacZ, negative control). Dashed lines 
indicate a deviation by log2 from diagonal. 


by approximately 15 shRNAs. Primary data from this analysis are 
shown for three genes in Fig. 1b: LacZ (negative control), Cblb (an 
E3 ubiquitin ligase that induces T-cell receptor internalization) and 
Ppp2r2d (not previously studied in T cells). For both Ppp2r2d and Cblb, 
five shRNAs were substantially increased in tumours (red) compared 
to spleen, whereas no enrichment was observed for LacZ shRNAs. 
Overall, 43 genes met the following criteria: ≥ fourfold enrichment 
for three or more shRNAs in tumours compared to spleen (Table 1a 
and Extended Data Fig. 1c, d). The set included gene products previously 
identified as inhibitors of T-cell receptor signalling (including 
Cblb, Dgka, Dgkz, Ptpn2), as well as other well-known inhibitors of 
T-cell function (for example, Smad2, Socs1, Socs3, Egr2), validating our 
approach (Table 1b and Extended Data Table 1). 
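
The selection criterion stated above (at least three shRNAs per gene with at least fourfold enrichment in tumour relative to spleen) can be written as a short filter. This is a sketch under stated assumptions rather than the authors' code: the function name and the input format (a dictionary of per-gene fold-enrichment values) are hypothetical, and only the two thresholds come from the text.

```python
def call_hits(enrichment_by_gene, min_fold=4.0, min_shrnas=3):
    """Return genes with at least `min_shrnas` shRNAs enriched >= `min_fold`.

    enrichment_by_gene: dict mapping a gene name to a list of tumour/spleen
    fold-enrichment values, one per shRNA targeting that gene
    (hypothetical input format).
    """
    hits = []
    for gene, folds in enrichment_by_gene.items():
        # Count the shRNAs for this gene that pass the fold threshold.
        n_enriched = sum(1 for fold in folds if fold >= min_fold)
        if n_enriched >= min_shrnas:
            hits.append(gene)
    return sorted(hits)
```

Requiring several independent shRNAs per gene guards against off-target effects of a single shRNA, which is presumably why the focused secondary pools carried approximately 15 shRNAs per gene.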


Target validation 

We next confirmed at a cellular level that these shRNAs induce T-cell 
accumulation in tumours. OT-I T cells were infected with lentiviral 
vectors driving expression of a single shRNA and a reporter protein 
(Thy1.1 or one of four different fluorescent proteins), and after seven 
days the frequency of shRNA-transduced T cells was quantified in 
tumours, spleens and lymph nodes by flow cytometry. When the control 
LacZ shRNA was expressed in CD8 T cells, the frequency of shRNA- 
expressing CD8 T cells was lower in tumours compared to spleen 
(~twofold). In contrast, experimental shRNAs induced accumulation 


a
Screen          Measure           T-cell dysfunction   Kinase/phosphatase   shRNA enrichment in tumour
First screen    Genes             255                  1,307                4–10-fold: 123
                shRNAs            1,275                6,535                10–20-fold: 17
                Candidate genes   32                   82                   >20-fold: 1
Second screen   Genes             32                   43                   4–10-fold: 191
                shRNAs            480                  645                  10–20-fold: 27
                Candidate genes   17                   26                   >20-fold: 1

b
Function                                  Genes
Inhibition of TCR signalling              Cblb, Dgka, Dgkz, Fyn, Inpp5b, Ppp3cc, Ptpn2, Stk17b, Tnk1
Phosphoinositol metabolism                Dgka, Dgkz, Impk, Inpp5b, Sbf1
Inhibitory cytokine signalling pathways   Smad2, Socs1, Socs3
AMP signalling, inhibition of mTOR        Entpd1, Prkab2, Nuak
Cell cycle                                Cdkn2a, Pkd1, Ppp2r2d
Actin and microtubules                    Arhgap5, Mast2, Rock1
Potential nuclear functions               Blvrb, Egr2, Impk, Jun, Ppm1g
Role in cancer cells                      Alk, Arhgap5, Eif2ak3, Hipk1, Met, Nuak, Pdzk1ip1, Rock1, Yes1

a, T-cell dysfunction and kinase/phosphatase screens. Listed are numbers of genes, shRNAs in each gene set and identified candidate genes. Genes were considered positive in secondary screens when ≥ 3 
shRNAs showed ≥ fourfold enrichment in tumour relative to spleen. b, Functional classification of candidate genes from secondary screens. 
of CD8 T cells in tumours but not in the spleen (Fig. 2a and Extended 
Data Fig. 2a). T-cell accumulation in tumours was more than tenfold 
relative to spleen for seven of these genes. The strongest phenotype was 
observed with shRNAs targeting Ppp2r2d, a regulatory subunit of the 
family of PP2A phosphatases. A Ppp2r2d shRNA not only induced 
Figure 2 | shRNA-driven accumulation of T cells in B16 melanoma. a, CD8 
(OT-I) T-cell enrichment in tumours relative to spleen (n = 3). b, Enrichment 
of Ppp2r2d-silenced CD8 (OT-I) or CD4 (TRP1) T cells (Thy1.1+ cells) in 
tumour versus spleen. c, Reversal of shRNA-induced phenotype by Ppp2r2d 
cDNA with mutated shRNA binding site. NS, not significant. d, Quantitative 
PCR for Ppp2r2d mRNA in tumour-infiltrating OT-I T cells (day 7). e, Ppp2r2d 
protein quantification by mass spectrometry with labelled synthetic peptides 
(AQUA, ratio of endogenous to AQUA peptides). Representative data from 
two independent experiments (a–d); two-sided Student’s t-test, *P ≤ 0.05, 
**P < 0.01; mean ± s.d. 
accumulation of OT-I CD8 T cells, but also CD4 T cells (from TRP-1 
TCR transgenic mice), with T-cell numbers in tumours being significantly 
higher when Ppp2r2d rather than LacZ shRNA was expressed 
(36.3-fold for CD8; 16.2-fold for CD4 T cells) (Fig. 2b). CD8 T-cell 
accumulation correlated with the degree of Ppp2r2d knockdown, and 
two Ppp2r2d shRNAs with the highest in vivo activity induced the 
lowest levels of Ppp2r2d messenger RNA (Extended Data Fig. 2b). 
Ppp2r2d knockdown was also confirmed at the protein level using a 
quantitative mass spectrometry approach (Fig. 2e). Ppp2r2d shRNA 
activity was specific because the phenotype was reversed when a Ppp2r2d 
complementary DNA (with wild-type protein sequence, but mutated 
DNA sequence at the shRNA binding site) was co-introduced with the 
Ppp2r2d shRNA (Fig. 2c and Extended Data Fig. 3). Furthermore, OT-I 
CD8 T cells overexpressed Ppp2r2d in tumours compared to spleen (in 
the absence of any shRNA expression), indicating that it is an intrinsic 
component of the signalling network inhibiting T-cell function in 
tumours (Fig. 2d). Microarray analysis of tumour-infiltrating T cells 
expressing different shRNAs showed that each shRNA induced a lar- 
gely distinct set of gene expression changes, indicating that improved 
T-cell function in tumours can be mediated through a number of 
different intracellular pathways (Extended Data Fig. 4). 


Cellular mechanisms for Ppp2r2d 


We next examined the cellular mechanisms driving T-cell accumula- 
tion by a Ppp2r2d shRNA in tumours, specifically T-cell infiltration, 
proliferation and apoptosis. T-cell infiltration into tumours was assessed 
by transfer of OT-I CD8 T cells labelled with a cytosolic dye (carboxy- 
fluorescein succinimidyl ester, CFSE). No differences were observed in 
the frequency of Ppp2r2d or LacZ shRNA-transduced CD8 T cells in 
tumours on day 1, indicating no substantial effect on T-cell infiltration 
(Fig. 3a). However, analysis of later time points (days 3-7) demon- 
strated a higher degree of proliferation (based on CFSE dilution) by 
Ppp2r2d compared to LacZ shRNA-transduced T cells (Fig. 3b and 
Extended Data Fig. 5a). The action of Ppp2r2d was downstream of 
T-cell receptor activation because T-cell proliferation was enhanced 
in tumours and to a lesser extent in tumour-draining lymph nodes 
(Extended Data Fig. 5a). In contrast, no proliferation was observed in 
irrelevant lymph nodes or the spleen where the relevant antigen was 
not presented to T cells (data not shown). Substantial T-cell prolifera- 
tion was even observed for LacZ shRNA-transduced T cells (complete 
dilution of CFSE dye by day 7), despite the presence of small numbers 
of such cells in tumours. This indicated that LacZ shRNA-transduced 
T cells were lost by apoptosis. Indeed, a larger percentage of tumour- 
infiltrating T cells were labelled with an antibody specific for active 
caspase 3 when the LacZ control shRNA (rather than Ppp2r2d shRNA) 
was expressed (Fig. 3c and Extended Data Fig. 5b). Furthermore, co- 
culture of CD8 T cells with B16-Ova tumour cells showed that the 
majority of LacZ shRNA-expressing T cells became apoptotic (65.7%), 
whereas most Ppp2r2d shRNA-transduced T cells were viable (89.5%, 
Fig. 3d). 
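
Proliferation in this section is read out from CFSE dilution. The underlying arithmetic can be sketched as follows, assuming the idealised case in which each cell division halves the dye content of the daughter cells and autofluorescence is negligible; the function name and the use of mean fluorescence intensities as inputs are illustrative assumptions, not the authors' gating procedure.

```python
import math


def estimated_divisions(undivided_mfi, observed_mfi):
    """Estimate the number of divisions from CFSE fluorescence.

    Assumes idealised halving of CFSE per division, so
    divisions = log2(undivided_mfi / observed_mfi). Both inputs are
    fluorescence intensities in the same arbitrary units.
    """
    if undivided_mfi <= 0 or observed_mfi <= 0:
        raise ValueError("fluorescence intensities must be positive")
    # Clamp at zero: a cell brighter than the undivided peak has not divided.
    return max(0.0, math.log2(undivided_mfi / observed_mfi))
```

Under this idealisation, an eightfold drop in fluorescence corresponds to three divisions. In practice the signal eventually reaches the autofluorescence floor, which is why "complete dilution" of the dye indicates many divisions without giving an exact count.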

These results indicated the possibility that Ppp2r2d shRNA-transduced 
CD8 T cells may be able to proliferate and survive even when they 
recognize their antigen directly presented by B16-Ova tumour cells. 
This idea was tested by implantation of tumour cells into B2m−/− mice, 
which are deficient in expression of MHC class I proteins. In such mice, 
only tumour cells, but not professional antigen-presenting cells of the 
host, could present tumour antigens to T cells. Indeed, Ppp2r2d shRNA- 
transduced OT-I CD8 T cells showed massive accumulation within 
B16-Ova tumours in B2m−/− mice (Fig. 3e), whereas very small numbers 
of T cells were present in contralateral B16 tumours that lacked expres- 
sion of the Ova antigen. Ppp2r2d-silenced T cells could therefore effec- 
tively proliferate and survive in response to tumour cells, despite a lack 
of suitable co-stimulatory signals and an inhibitory microenvironment. 

Ex vivo analysis of tumour-infiltrating T cells at a single-cell level 
using a nanowell device also demonstrated that Ppp2r2d silencing 
increased cytokine production by T cells (Fig. 4a–c). T cells were activated 
Figure 3 | Changes in T-cell function induced by Ppp2r2d shRNA. 
a, Tumour infiltration at 24 h by CFSE-labelled OT-I T cells. b, Enhanced 
proliferation by Ppp2r2d-silenced T cells (CFSE dilution). c, d, Reduced 
apoptosis by Ppp2r2d-silenced OT-I T cells in tumours (c, activated caspase-3) 
or during 3-day co-culture with B16-Ova tumour cells (d, annexin V). 
e, Ppp2r2d-silencing induced T-cell expansion even when MHC class I 
expression was restricted to tumour cells; T-cell transfer into C57BL/6 or 
B2m−/− mice with B16-Ova tumours. Data representative of two independent 
trials (n = 3; **P < 0.01, two-sided Student’s t-test); mean ± s.d. 


for 3 h by CD3/CD28 antibodies on lipid bilayers, followed by 1 h cytokine 
capture on antibody-coated slides. CD8 T cells showed a higher secretion 
rate for interferon-γ, interleukin-2 and granulocyte-macrophage 
colony-stimulating factor (IFN-γ, IL-2 and GM-CSF, respectively) and 
a larger fraction of T cells secreted more than one cytokine (Fig. 4b, c). 
The presence of larger numbers of IFN-γ-producing T cells was confirmed 
by intracellular cytokine staining (Fig. 4d and Extended Data 
Fig. 5c). 

PP2A represents a family of phosphatase complexes composed of 
catalytic, scaffolding and regulatory subunits. Cellular localization and 
substrate specificity are determined by one of many regulatory subunits, 
of which Ppp2r2d is a member. Ppp2r2d directs PP2A to Cdk1 
substrates during interphase and anaphase; it thereby inhibits entry 
into mitosis and induces exit from mitosis. PP2A also has a gatekeeper 
Figure 4 | Cytokine secretion by gene-silenced tumour-infiltrating T cells. 
a–c, Ex vivo analysis of cytokine production by tumour-infiltrating OT-I T cells 
at a single-cell level using a nanowell device (84,672 wells of picolitre volume). 
a, Representative single cells in nanowells and corresponding patterns of 
cytokine secretion. b, Percentage of T cells secreting indicated cytokines. 
c, Cytokine secretion rates calculated from standard curves (mean ± s.d., 
*P < 0.05, Mann-Whitney U-test). d, Intracellular IFN-γ staining for tumour- 
infiltrating Ppp2r2d-silenced T cells, representative of two independent 
experiments (n = 3, **P < 0.01, two-sided Student’s t-test); mean ± s.d. 


role for BAD-mediated apoptosis. Phosphorylated BAD is sequestered 
in its inactive form in the cytosol by 14-3-3, whereas dephosphorylated 
BAD is targeted to mitochondria where it causes cell death by binding 
Bcl-XL and Bcl-2. PP2A phosphatases have also been shown to interact 
with the cytoplasmic domains of CD28 and CTLA-4 as well as Carma1 
(upstream of the NF-κB pathway), but it is not known which regulatory 
subunits are required for these activities. Anti-Ppp2r2d antibodies 
suitable for the required biochemical studies are not currently available. 


Enhanced anti-tumour immunity 
Finally, we assessed the ability of a Ppp2r2d shRNA to enhance the 
efficacy of adoptive T-cell therapy. B16-Ova tumour cells (2 X 10°) 
were injected subcutaneously into B6 mice. On day 12, mice bearing 
tumours of similar size were divided into seven groups, either receiv- 
ing no T cells, 2 x 10° shRNA-transduced TRP-1 CD4 T cells, 2 X 10° 
shRNA-infected OT-I CD8 T cells, or both CD4 and CD8 T cells (days 
12 and 17). The modest anti-tumour activity of OT-I CD8 T cells 
(expressing the control LacZ shRNA) is consistent with published 
data. Ppp2r2d-silencing improved the therapeutic activity of both 
CD4 and CD8 T cells (Fig. 5a, b). A Ppp2r2d shRNA also enhanced 
anti-tumour responses when introduced into T cells specific for the 
endogenous melanoma antigens gp100 (pmel-1 CD8 T cells) and 
TRP-1 (TRP-1 CD4 T cells) (Fig. 5c). gp100 is a relevant antigen in 
human melanoma, and a clinical trial in which a gp100-specific TCR 
(isolated from HLA-A2 transgenic mice) was introduced into peripheral 
blood T cells demonstrated therapeutic benefit in a subset of patients. 
Ppp2r2d-silenced T cells acquired an effector phenotype in tumours 
(Extended Data Fig. 6a) and >30% of the cells expressed granzyme B 
(Extended Data Fig. 7a). Consistent with greatly increased numbers of 
such effector T cells in tumours (Extended Data Fig. 7b), terminal deoxy- 
nucleotidyl transferase dUTP nick end labelling (TUNEL) demon- 
strated increased apoptosis in tumours when Ppp2r2d rather than 
LacZ shRNA-expressing T cells were present (Extended Data Fig. 
7c). B16 melanomas are highly aggressive tumours in part because 
MHC class I expression is very low. Interestingly, Ppp2r2d but not 
LacZ shRNA-expressing T cells significantly increased MHC class I 
expression (H-2Kb) by tumour cells (Extended Data Fig. 7d), possibly 
due to the observed increase in IFN-γ secretion by T cells (Fig. 4b–d). A 
Ppp2r2d shRNA did not reduce expression of inhibitory PD-1 or LAG-3 
receptors on tumour-infiltrating T cells, demonstrating that its mechanism 
of action is distinct from these known negative regulators of T-cell 
function (Extended Data Fig. 6b). This finding suggests combination 
approaches targeting these intracellular and cell surface molecules. 


Discussion 


These results establish the feasibility of in vivo discovery of novel targets 
for immunotherapy in complex tissue microenvironments. We show 
that it is possible to discover genes with differential action across tissues, 
as exemplified by T-cell accumulation in tumours compared to second- 
ary lymphoid organs. For genes with tissue-selective action, T-cell pro- 
liferation and survival are likely to be under the control of the T-cell 
receptor and therefore do not occur in tissues lacking presentation of a 
relevant antigen. Many variations of the approach presented here can 
be envisioned to investigate control of particular immune cell functions 
in vivo. For example, fluorescent reporters for expression of cytokines 
or cytotoxic molecules (granzyme B, perforin) could be integrated into 
our approach to discover genes that control critical T-cell effector func- 
tions in tumours. 

Targeting of key regulatory switches may offer new approaches to 
modify the activity of T cells in cancer and other pathologies. For 
example, recent clinical trials have shown that transfer of genetically 
modified T cells can result in substantial anti-tumour activity. The 
efficacy of such T-cell-based therapies could be enhanced by shRNA- 
mediated silencing of genes that inhibit T-cell function in the tumour 
microenvironment. 


METHODS SUMMARY 


In vivo shRNA screening. Nine shRNA pools (approximately 5 shRNAs per 
gene) were created and subcloned into the pLKO-Thy1.1 lentiviral vector. Each 
pool also included 85 negative-control shRNAs. OT-I T cells were cultured with 
IL-7 (5 ng ml−1) and IL-15 (100 ng ml−1); on day 2 cells were spin-infected with 
lentiviral pools supplemented with protamine sulphate (5 µg ml−1) in RetroNectin- 
coated 24-well plates (5 µg ml−1) at a multiplicity of infection (MOI) of 15. Following 
infection, OT-I cells were cultured with IL-7 (2.5 ng ml−1), IL-15 (50 ng ml−1) and 
IL-2 (2 ng ml−1). On day 5, shRNA-transduced T cells were enriched by positive 
selection using the Thy1.1 surface reporter (StemCell Technologies). T cells 
(5 X 10°) were injected intravenously into C57BL/6 mice bearing day 14 B16- 
Ova tumours (15 mice per shRNA pool). Seven days later, shRNA-expressing 
T cells (CD8+ Vα2+ Vβ5+ Thy1.1+) were isolated by flow cytometry from tumours, 
spleens, tumour-draining lymph nodes and irrelevant lymph nodes. Genomic DNA 
was purified (Qiagen) and deep-sequencing templates were generated by PCR 
amplification of the shRNA cassette. Representation of shRNAs in each pool was 
analysed by deep sequencing using an Illumina Genome Analyzer. 

Secondary screens were performed using focused pools containing approximately 
15 shRNAs per gene as well as 85 negative controls. Cut-off in the secondary 
screen was defined as ≥ 3 shRNAs with ≥ fourfold enrichment in tumour relative 
to spleen. Screening results were validated at a cellular level by introducing indi- 
vidual shRNAs into T cells, along with a reporter protein (green, teal, red or 
ametrine fluorescent proteins, Thy1.1). This approach enabled simultaneous test- 
ing of five shRNAs in an animal (three mice per group). Proliferation of shRNA- 
transduced T cells was visualized on the basis of CFSE dilution after 24h as well as 
3, 5 and 7 days. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
An environmental bacterial taxon with a 
large and distinct metabolic repertoire 
Alexander O. Brachmann, Cristian Gurgui, Toshiyuki Wakimoto, Matthias Kracht, Max Crüsemann, Ute Hentschel, 
Ikuro Abe, Shigeki Matsunaga, Jörn Kalinowski, Haruko Takeyama & Jörn Piel 


Cultivated bacteria such as actinomycetes are a highly useful source of biomedically important natural products. However, 
such ‘talented’ producers represent only a minute fraction of the entire, mostly uncultivated, prokaryotic diversity. The 
uncultured majority is generally perceived as a large, untapped resource of new drug candidates, but so far it is unknown 
whether taxa containing talented bacteria indeed exist. Here we report the single-cell- and metagenomics-based dis- 
covery of such producers. Two phylotypes of the candidate genus ‘Entotheonella’ with genomes of greater than 9 
megabases and multiple, distinct biosynthetic gene clusters co-inhabit the chemically and microbially rich marine 
sponge Theonella swinhoei. Almost all bioactive polyketides and peptides known from this animal were attributed to a 
single phylotype. ‘Entotheonella’ spp. are widely distributed in sponges and belong to an environmental taxon proposed 
here as candidate phylum ‘Tectomicrobia’. The pronounced bioactivities and chemical uniqueness of ‘Entotheonella’ 
compounds provide significant opportunities for ecological studies and drug discovery. 


More than half of the known natural products with antimicrobial, anti-tumour 
or antiviral activity are of bacterial origin. Most of these compounds 
were isolated from cultivated representatives of only five bacterial 
groups: filamentous actinomycetes, Myxobacteria, Cyanobacteria, and 
members of the genera Pseudomonas and Bacillus. Uncultivated bacteria, 
which are proposed to form 70% of all known prokaryotic phyla, 
represent a particularly promising source for new, chemically prolific 
taxa. However, except for individual biosynthetic pathways reported 
from environmental sources, the true metabolic potential of these 
microbes remains unexplored. Two such pathways, involved in the 
production of onnamide- and theopederin-type polyketides and ribosomal 
peptides of the polytheonamide group (Fig. 1), were previously 
discovered in the marine sponge Theonella swinhoei. Like many other 
sponges, this animal harbours a massive consortium of uncultivated 
bacteria belonging to hundreds of distinct phylotypes. T. swinhoei is 
the source of exceptionally diverse natural products and forms distinct 
chemotypes; samples of the sponge collected from different locations 
have largely non-overlapping metabolite profiles. From the onnamide 
and polytheonamide chemotype occurring at Hachijo Jima, Japan, here 
termed T. swinhoei Y (Y referring to the yellow interior of the sponge), 
in total more than 40 bioactive polyketides and modified peptides 
belonging to seven structural classes were isolated (Fig. 1). As previous 
work on onnamides and polytheonamides has produced only meta- 
genomic DNA fragments lacking taxonomically diagnostic features, it 
was unknown which members of the bacterial community are the pro- 
ducers of these compounds. 


Attribution of metabolic genes to ‘Entotheonella’ 


Single-cell analysis has recently emerged as an efficient strategy to correlate 
the phylogenetic identity of environmental microorganisms with 
their functional gene repertoire. To pinpoint producers in T. swinhoei 
Y, samples enriched in bacteria of different cell densities were prepared 
by differential centrifugation after sponge collection. When a fraction 
of higher density (Fig. 2a) was microscopically examined, we found that 
it contained a highly enriched population of large filamentous bacteria 
that fluoresce when excited with ultraviolet light (Fig. 2b). The bacteria 
were morphologically similar to the symbiont ‘Candidatus Entotheonella 
palauensis’ previously reported from a Palauan Theonella swinhoei 
chemotype and suspected to be the producer of antifungal peptides. Scanning 
electron micrographs (Fig. 2c) revealed the presence of approximately 
2- to 3-µm cells linked to each other. These bacteria, as well as 
those from the low-density fraction containing mostly unicellular bac- 
teria, were sorted individually into 96-well plates by fluorescence-assisted 
cell sorting (FACS) (Extended Data Fig. 1a), resulting in filamentous 
and unicellular plates. Subsequently, multiple displacement amplifica- 
tion (MDA) of single bacterial genomes was performed on each well, 
resulting in DNA product sizes of approximately 10 kb (Extended Data 
Fig. 1b). 

To detect wells containing DNA from the onnamide or polytheo- 
namide producer, primers specific for onn and poy genes encoding the 
respective pathways were used in diagnostic PCRs. For both gene clus- 
ters, a large number of positive wells were detected among the filamen- 
tous plates (Fig. 2d and Extended Data Fig. 1c). Subsequent PCRs with 
eubacterial and ‘Entotheonella’-specific 16S ribosomal RNA gene primers 
showed that about half of the wells contained DNA originating 
from ‘Entotheonella’ phylotypes. Overall, from 48 wells of an analysed 
filamentous plate, 22 wells were positive for the onnamide, 34 for the 
polytheonamide, and 27 for the ‘Entotheonella’ sp. 16S rRNA gene, as 
confirmed by sequencing of each amplicon. Sixteen of the positive wells 
showed amplification for all three of the onn, poy and ‘Entotheonella’ sp. 
Figure 1 | Representative bioactive natural product families isolated from 
the sponge Theonella swinhoei. Polytheonamides A and B differ in the 


primer pairs in one or more out of three repetitive PCRs (Fig. 2d). 
For further analysis, wells positive for all three primer sets were sub- 
jected to PCR using eubacterial 16S rRNA gene primers. 16S rRNA 
genes from ‘Entotheonella’ sp. as well as Escherichia coli were identified. 
The E. coli amplicon was discarded, as it was also identified in MDA- 
treated wells that only contained water. Thus, the data suggested 
‘Entotheonella’ as the source of both the onnamide-type compounds 
and polytheonamides. 


Two chemically distinct ‘Entotheonella’ symbionts 


As not all wells were positive for all three primer pairs and bacteria 
might have been overlooked owing to incomplete genome coverage 
during MDA, we wished to further validate our results by metagenomic 
sequencing. On the filamentous bacterial cell sample, several 
rounds of Illumina, 454, PacBio, and Sanger sequencing were performed 
(Supplementary Table 1). Of the sequencing reads, 78.3% assembled to 
longer contigs, resulting in 18,093 contigs of at least 500 bp. The remain- 
ing reads did not show significant overlap, suggesting that the cor- 
responding phylotypes were present only at low concentration. This 
hypothesis was backed by the observation of a high variance in cov- 
erage, ranging between 3.3- and 1,564.7-fold for contigs of at least 2 kb 
length. Basic Local Alignment Search Tool X (BLASTX) analysis of the 
contig and scaffold sequences followed by binning based on sequence 
depth and G + C content revealed two large populations of bacterial 
DNA with a G + C content around 55% (Supplementary Table 2). A 
third set of low coverage and low G + C contigs delivered hits against 
various eukaryotic genomes and was therefore excluded from further 
analyses (Extended Data Fig. 2a). A more detailed analysis of the fil- 
tered data set revealed for most bacterial genes the existence of two 
highly similar versions (approximately 85-91% nucleotide identity) 
that resided in virtually syntenous genomic environments encompass- 
ing over 4.5 Mb (Extended Data Fig. 3). The overlapping genomic 
regions included exactly two orthologues of 35 single-copy genes often 
used as bacterial phylogenetic markers (Supplementary Table 3). These 
features suggested that the large majority of assembled bacterial sequences 
belonged to two closely related ‘Entotheonella’ variants, termed TSY1 
and TSY2, with 97.6% identical 16S rRNA gene sequences and an ave- 
rage G + C content of 55.79% (Extended Data Fig. 2b). The identity of 
the 16S rRNA genes to that of ‘E. palauensis’ was about 97%. Depth 
analysis also suggested the presence of about 236 kb of DNA belonging 
to at least one large plasmid (G + C content: 55.11%). Coverage was 
60.3-, 24.5- and 278.5-fold for the TSY1 and TSY2 chromosomes and 
the plasmid(s), respectively (corresponding to a ratio of 1:0.4:5), indi- 
cating that TSY1 is the dominant strain (Extended Data Fig. 2b). Both 
stereochemistry of the sulphoxide moiety (polytheonamide A shows 
S chirality; polytheonamide B shows R chirality). 


strains possess genomes of similar size that exceed 9 Mb, thus belong- 
ing to the largest known prokaryotic genomes (Supplementary Table 
2). A remarkably large number of repetitive elements, some present in 
about 25 to 100 copies, as well as the high degree of similarity of the 
two genomes prohibited further assembly. To determine completeness 
of genomes, a core gene group analysis was performed, identifying 62 
of 66 core groups for both TSY1 and TSY2. Thus we assume that the 
protein inventory of both strains was almost completely established. 
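
Two quantities used in the binning above, G + C content and sequencing depth relative to a reference replicon, are simple to compute. The sketch below is illustrative only and is not the authors' binning workflow: the function names and the dictionary input are assumptions, while the coverage values in the usage note are those reported in the text.

```python
def gc_content(sequence):
    """Percentage of G and C among the unambiguous bases (A, C, G, T)."""
    sequence = sequence.upper()
    gc = sum(1 for base in sequence if base in "GC")
    acgt = sum(1 for base in sequence if base in "ACGT")
    # Ambiguous bases such as N are ignored; an empty sequence gives 0.0.
    return 100.0 * gc / acgt if acgt else 0.0


def relative_coverage(coverages, reference):
    """Express each mean coverage as a multiple of the reference replicon."""
    ref = coverages[reference]
    return {name: cov / ref for name, cov in coverages.items()}
```

Calling `relative_coverage({"TSY1": 60.3, "TSY2": 24.5, "plasmid": 278.5}, "TSY1")` gives ratios of roughly 1 : 0.41 : 4.6, which the text rounds to 1:0.4:5. Likewise, 62 of 66 core groups corresponds to about 94% of the core gene set.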
The search for metabolic genes in this data set revealed complete 
sets of onn and poy genes on the plasmid-derived contigs. In addition, 
an unexpectedly high number of further gene clusters for polyketide 
and ribosomal or non-ribosomal peptide biosynthesis were identified 
on the chromosomal sequences. To allow for prediction of the cor- 
responding metabolites, sequence gaps within most of these loci were 
filled by paired-end sequencing of 3- and 8-kb libraries and by com- 
binatorial or targeted PCR, resulting in at least 28 biosynthetic gene 
clusters on 31 scaffolds (Extended Data Fig. 4 and Supplementary 
Table 4). For many non-ribosomal peptide synthetase (NRPS) clusters, 
bioinformatic predictions based on enzyme colinearity rules, substrate 
recognition motifs, and the presence of genes for non-proteinogenic 
amino acid biosynthesis (Supplementary Table 5), revealed known bio- 
active peptides from Japanese T. swinhoei as the best structural hits. 
Specifically, we identified virtually perfect matches for the cyclotheo- 
namides, keramamides and nazumamide A. In addition, we identified 
a konbamide A-type cluster in which five of the six NRPS modules are 
present and colinear with the compound structure, but two ORF inser- 
tions disrupt the NRPS architecture, suggesting that the cluster is an 
inactive evolutionary relic. Consistent with this, members of the onna- 
mide, polytheonamide, keramamide, and cyclotheonamide compound 
families were detected using high-resolution mass spectrometry (HRMS) 
in extracts of our sponge specimens and enriched filamentous cell frac- 
tions, but we were unable to detect the konbamides (Supplementary 
Tables 6 and 7, and Extended Data Fig. 5). Taken together, the combined 
bioinformatic and chemical analyses identified candidate gene clusters 
for all known peptide and polyketide families, including onnamides 
and polytheonamides, except for the aurantosides. In addition to these 
attributable genes, loci for at least 14 peptides of unknown identity 
were found (Extended Data Fig. 4). Notably, this also includes four further 
gene clusters for proteusins, a recently discovered new natural product 
family with polytheonamides as the only members known to date. 
Tandem mass spectrometry (MS-MS)-based molecular networking 
suggested a high diversity of previously unknown metabolite families, 
indicating that at least some of these orphan pathways are likely to be 
active (Extended Data Fig. 6). The gene candidates for konbamides 

Figure 2 | Single-cell analytic studies. a, Differential interference contrast
micrograph of the filamentous fraction after differential centrifugation (n = 3). 
b, Fluorescence micrograph of filamentous bacteria without (top) and with 
(bottom) ultraviolet excitation (n = 3). c, Scanning electron micrograph of a 
single filamentous bacterium (n = 3). d, Nested PCR of natural product gene 
clusters from whole-genome amplification samples of wells sorted with 
single filaments (n = 48). Wells showing positive amplification for 
‘Entotheonella’ sp. 16S rRNA gene, onnamide (onn) and polytheonamide (poy) 
gene clusters (n = 3) were used for the identification of the other enzyme
clusters (n = 1). Each lane represents a single well defined by the well identifier 
above the top row. cth, cyclotheonamide; ker, keramamide; kon, konbamide; 
naz, nazumamide. PC, metagenomic DNA from filamentous fraction; 

pts, unknown proteusin. 


(kon), keramamides (ker), nazumamide A (naz), and an unknown 
non-ribosomal peptide formed a supercluster of 129 kb. The binning 
data suggested that this region, the putative cyclotheonamide (cth), 
and the two unassigned proteusin loci all belong to the chromosome of 
the dominant ‘Entotheonella’ sp. TSY1 (Extended Data Fig. 2b). The 
chromosome of TSY2 contained fewer (at least seven) metabolic gene 
clusters (two polyketide, at least two NRPS, and a further proteusin 
cluster) that could not be assigned to known compounds. Except for a 
small NRPS and a type III polyketide synthase (PKS) system present in 
both genomes, there was no overlap in the natural product gene rep- 
ertoires of TSY1 and TSY2, indicating that significant chemical varia- 
tion exists among members of ‘Entotheonella’, even within the same 
sponge individual. To validate further the source of the plasmid-based 
polytheonamide and onnamide genes, we conducted additional single- 
cell experiments (Fig. 2d). All MDA samples previously analysed pos- 
itive for onn and poy genes were tested again with PCR primers for 
various genes of the kon, naz, cth, ker and one unknown proteusin 
pathway. For all cases, positive wells were identified, suggesting that TSY1 
carries the plasmid and produces the entire set of metabolites. 
Functional evidence for the identity of ‘Entotheonella’ gene clusters 
was obtained by biochemically characterizing gene products from several 
pathways. Two selected NRPS adenylation domains encoded within the 
putative cth and ker pathways were overproduced in E. coli and analysed 
using a γ-¹⁸O₄-ATP pyrophosphate exchange assay to investigate their
amino acid substrate specificity (Extended Data Fig. 7). For the cth
NRPS, the adenylation domain of module 2 (CthA2) exhibited high 
selectivity for the rare amino acid 2,3-diaminopropionate (DAP), con- 
sistent with the cyclotheonamide structure (Extended Data Fig. 7). The 
incorporation of this building block is also supported by the presence 
of two genes in the cluster that encode homologues of SbnAB-type 
DAP synthases. KerA5 showed the greatest substrate specificity for Leu, in
agreement with known keramamides and the bioinformatic prediction 
(Extended Data Fig. 7). Thus, taking the colinearity rule of NRPSs into 
account, the data support the proposed function of these gene clusters. 
We also obtained functional support for a biosynthetic role of the 
unknown proteusin pathway TSY1_14 by co-expressing the putative 
nitrile-hydratase-like precursor peptide with a predicted lanthionine 
synthetase encoded directly adjacent to the precursor gene. Up to three 
dehydrations of the core peptide were observed by HRMS for the co- 
expression product compared to the unmodified peptide from expres- 
sion of the precursor peptide alone. Subsequent alkylation of reduced 
cysteine residues and tandem MS-MS indicated that, for each dehydration,
one lanthionine bridge was formed within the predicted core peptide
(Extended Data Fig. 8 and Supplementary Table 8). These experiments
demonstrated that the putative proteusin gene cluster TSY1_14
encodes a functional precursor peptide and modifying lanthionine 
synthetase. Considering the high complexity of the sponge microbiome, 
which contains hundreds of ribotypes, the accumulation of metabolic 
genes in two variants of ‘Entotheonella’ is remarkable. Owing to the
extraordinary biosynthetic repertoire, we propose the name ‘Candidatus
Entotheonella factor’ (Latin, factor; the producer) for these bacteria.
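
The dehydration counts reported above for the TSY1_14 co-expression product can be related to expected HRMS mass shifts with simple arithmetic. The following is a minimal sketch, not the analysis used in this study: it assumes that each dehydration removes one water molecule (monoisotopic mass about 18.0106 Da) and that the subsequent ring closure to a lanthionine bridge does not change the mass further. The function names and the precursor mass in the usage lines are hypothetical and serve only as illustration.

```python
# Monoisotopic masses in daltons (standard values for 1H and 16O).
MASS_H = 1.00782503
MASS_O = 15.99491462
MASS_WATER = 2 * MASS_H + MASS_O  # about 18.0106 Da lost per dehydration


def dehydration_shift(n_dehydrations):
    """Mass change (Da) of a peptide after n dehydrations."""
    if n_dehydrations < 0:
        raise ValueError("number of dehydrations must be non-negative")
    return -n_dehydrations * MASS_WATER


def expected_masses(precursor_mass, max_dehydrations=3):
    """Expected neutral monoisotopic masses for 0..max_dehydrations water losses."""
    return [precursor_mass + dehydration_shift(n)
            for n in range(max_dehydrations + 1)]


if __name__ == "__main__":
    # Hypothetical core-peptide mass, for illustration only (not from the study).
    for n, mass in enumerate(expected_masses(3000.0)):
        print(f"{n} dehydration(s): {mass:.4f} Da")
```

Comparing such a ladder of expected masses with the observed HRMS peaks is one simple way to count dehydrations; assigning the bridges themselves still requires the alkylation and MS-MS evidence described in the text.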


‘Entotheonella’ species are ubiquitous 


These findings raised the question of whether ‘Entotheonella’ spp. also
inhabit other sponges and could have a general role in natural product 
biosynthesis. It was previously shown that an enriched fraction of ‘E. 
palauensis’ from a Palauan chemotype of T. swinhoei contained high 
concentrations of the hybrid polyketide-peptide theopalauamide.
‘Entotheonella’ members were also detected in another lithistid sponge,
Discodermia dissoluta, which contains the anticancer polyketide
discodermolide. To analyse the distribution of ‘Entotheonella’ spp. in depth,
37 taxonomically diverse sponge species collected at 20 locations (Sup- 
plementary Table 9) were tested by PCR based on conserved, unique 
regions of ‘Entotheonella’ 16S rRNA genes. Of the 37 sponges, 28 yielded 
amplicons with sequences exhibiting 95.5-99.9% nucleotide identity 
to the homologues of ‘E. factor’ (Extended Data Fig. 9a, b). Thus, 
‘Entotheonella’ spp. seem to be widely distributed in marine sponges 
from distant geographical regions. ‘Entotheonella’ amplicons were also 
obtained from various seawater samples; however, contamination from 
sponges growing nearby cannot be excluded. For further insights into 
the discovery potential and chemical variability of these bacteria, we 
initiated studies on another chemotype of T. swinhoei (type W1, refer- 
ring to the white sponge interior) that contains the actin inhibitor mis- 
akinolide A (Extended Data Fig. 10b), a complex polyketide not present 
in the Y chemotype. PCR detection of PKS genes using the total sponge 
DNA generated exclusively amplicons that were phylogenetically attrib- 
uted to the sup type (Extended Data Fig. 10c), a putative fatty acid synthase 
that is widespread and dominant in most sponge microbial consortia 
and not involved in the production of complex, bioactive polyketides.
In contrast, a highly enriched ‘Entotheonella’ fraction (Extended Data 
Fig. 10a) prepared from this sponge yielded a completely different set 
of amplicons consisting of six gene fragments all belonging to PKSs 
associated with complex polyketide production (Extended Data Fig. 10c). 
None of these had a close homologue in TSY1 or TSY2, thus further 
supporting a diverse chemistry of ‘Entotheonella’ phylotypes.


A new candidate phylum, ‘Tectomicrobia’ 

To obtain insights into the taxonomic position of ‘Entotheonella’, an 
initial 16S rRNA-based phylogenetic analysis was conducted (Extended 
Data Fig. 9c). Altogether, 243 16S rRNA gene sequences were analysed 
from marine sponges in this study and from public databases. As the
16S rRNA sequences were only 82% identical to representatives from 
known bacterial phyla and form a well-separated clade, we suggest the 
status of a new candidate phylum. The name ‘Tectomicrobia’ (Latin,
tegere; to hide, to protect) was chosen to reflect their uncultured status
as well as their capability to produce bioactive compounds that are likely
to be used as chemical defence. The closest relatives of ‘Tectomicrobia’ are
Nitrospina spp., which were recently proposed to belong to a new phylum,
Nitrospinae. The known sequences belonging to ‘Tectomicrobia’
comprise at least three discrete phylogenetic clades. The largest encom- 
passes all ‘Entotheonella’ sequences sensu stricto, which were largely 
recovered from marine sponges but also seawater (138 sequences total, 
of which 107 sequences were produced in this study), a second clade 
includes related, non-‘Entotheonella’ 16S rRNA gene sequences from
various marine sponges (36 sequences), and a third group contains 16S 
rRNA gene sequences from terrestrial soils (18 sequences). For further 
validation of the phylogenetic data, we calculated trees using up to 38 
concatenated, universally conserved single-copy marker proteins of
TSY1, TSY2, and 2,509 bacterial and archaeal taxa to determine the
position of ‘Entotheonella’ in the tree of life. Recalculations with data
sets from closely affiliated phyla (Fig. 3) supported ‘Entotheonella’ as 
belonging to a new sister phylum to Nitrospinae, in agreement with the 
16S rRNA data. 


Conclusions 


Owing to the high frequency of structurally distinct, bioactive meta- 
bolites in sponges, these animals have an important role in drug 
discovery. Compound localization studies suggested bacteria as producers
of individual metabolites, but remained ambiguous
owing to the possibility of sequestration or transport. The true source 
of sponge natural products has therefore been a long-standing and, 
with the exception of metagenomic data providing kingdom-level 
information, unanswered question. Here we provide evidence that
a single member of the highly diverse microbiome of T. swinhoei Y, ‘E. 
factor TSY1’, is the source of almost all polyketides and peptides that 
have been isolated from its sponge host. The bioinformatic assignment 
to known compounds is further supported by functional studies for 
polytheonamides, onnamide-type compounds, keramamides,
cyclotheonamides and an orphan proteusin. Our data on TSY1, TSY2, and
a highly enriched ‘Entotheonella’ preparation from a second T. swin- 
hoei chemotype, indicate that members of this candidate genus contain 

phyla. RAxML inference of 991 taxa with 100 bootstrap iterations based
on up to 38 marker genes. Sequences are collapsed on the phylum level
and the number of collapsed sequences is shown for each clade. The two
‘Tectomicrobia’ variants TSY1 and TSY2 are highlighted in bold. Bootstrap
support values equal to or greater than 70% are shown for each node. The
scale bar represents 10% estimated sequence divergence. PV-1 and 3-11 are 
strain names; OP8 is the former name of the (then candidate) phylum 
Aminicenantes. 




producers with a rich and, so far, unique secondary metabolism. Reports 
on ‘Entotheonella’ spp. from two other chemically rich sponges
and our detection of these bacteria in many additional species hint at 
their more widespread role in the chemistry of their hosts. This study 
adds the first uncultivated prokaryotes to the taxonomically limited 
canon of metabolically talented bacteria. ‘Entotheonella’ spp. exhibit
interesting parallels to streptomycetes and some other well-known
producer groups; for example, expanded genome size, biosynthetic
superclusters and multiple modular assembly lines, high metabolic
variability among closely related organisms, and complex morphology. 
For ‘Entotheonella’ spp., complex morphology is particularly note- 
worthy, as it affords attractive opportunities to systematically study 
chemical interactions in marine symbioses and to exploit uncultivated 
bacteria in a targeted way for drug discovery. 


METHODS SUMMARY 


An adapted differential centrifugation protocol was used to sediment filamentous
and unicellular bacteria from the sponge tissue. Single bacterial cells and filaments
were sorted into micro-titre plates by flow cytometry with a BD FACSAria
II cell sorter (BD Biosciences). Genomic DNA was amplified using an Illustra 
Genomiphi V2 DNA Amplification Kit (GE Healthcare) and subjected to PCR 
analysis. Sequence information was obtained using the GS-FLX (454) and MiSeq 
(Illumina) platforms, using whole-genome sequencing and long mate-pair librar- 
ies. Additional sequence reads were obtained by PacBio sequencing (GATC) and 
Sanger sequencing (IIT). Reads were assembled using the Newbler (v2.6) de novo 
assembler. Automated annotation was performed with Rapid Annotation and 
Subsystem Technology (RAST) and validated manually. PKS and NRPS domain 
architecture and substrate specificities were based on sequence alignments and 
prediction-based software. Adenylation domains overexpressed in E. coli were
characterized using a γ-¹⁸O₄-ATP pyrophosphate exchange assay as previously
described. The TSY1_14 proteusin precursor peptide was overexpressed in E. coli
with and without the putative modifying LanM-like lanthionine synthetase from 
the same gene cluster. The resulting peptide products were analysed by liquid 
chromatography (LC)-electrospray ionization (ESI)-HRMS after TCEP (tris-(2- 
carboxyethyl)-phosphine) treatment, tryptic digest and derivatization. Extracts 
of T. swinhoei and enriched ‘Entotheonella’ were analysed by ultra-performance
liquid chromatography (UPLC) and nano-LC heated ESI (HESI)-HRMS followed
by eMZed data analysis and molecular networking.


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

A cosmic web filament revealed in Lyman-α
emission around a luminous high-redshift quasar


Sebastiano Cantalupo, Fabrizio Arrigoni-Battaia, J. Xavier Prochaska, Joseph F. Hennawi & Piero Madau


Simulations of structure formation in the Universe predict that
galaxies are embedded in a ‘cosmic web’, where most baryons reside
as rarefied and highly ionized gas. This material has been studied
for decades in absorption against background sources, but the sparseness
of these inherently one-dimensional probes precludes direct
constraints on the three-dimensional morphology of the underlying
web. Here we report observations of a cosmic web filament in
Lyman-α emission, discovered during a survey for cosmic gas fluorescently
illuminated by bright quasars at redshift z ~ 2.3. With a
linear projected size of approximately 460 physical kiloparsecs, the
Lyman-α emission surrounding the radio-quiet quasar UM 287
extends well beyond the virial radius of any plausible associated dark-matter
halo and therefore traces intergalactic gas. The estimated
cold gas mass of the filament from the observed emission (about
$10^{12.0 \pm 0.5}/C^{1/2}$ solar masses, where C is the gas clumping factor) is
more than ten times larger than what is typically found in cosmological
simulations, suggesting that a population of intergalactic gas
clumps with subkiloparsec sizes may be missing in current numerical
models.

A recent pilot survey using a custom-built, narrow-band filter on
the Very Large Telescope demonstrated that bright quasars can, like a
flashlight, ‘illuminate’ the densest knots in the surrounding cosmic web
and boost fluorescent Lyman-α emission to detectable levels. Following
the same approach, we imaged UM 287 on 2012 November 12
and 13 UT with a custom narrow-band filter (NB3985) tuned to Lyman-α
at z = 2.28 inserted into the camera of the Low Resolution Imaging
Spectrometer (LRIS) on the 10-m Keck I telescope (see Extended Data
Fig. 1). Figure 1 presents the processed and combined images, centred
on UM 287. In the NB3985 image, we identify a very extended nebula
originating near the quasar with a projected size of about 1 arcmin. In
the broad-band images no extended emission is observed. This requires
the narrow-band light to be line emission, and we identify it as Lyman-α
at the redshift of UM 287.

Figure 2 presents the NB3985 image, continuum subtracted using
standard techniques (see Methods) and smoothed with a 1-arcsec
Gaussian kernel. This image is dominated by the filamentary and asymmetric
nebula that has a maximum projected extent of 55 arcsec as
defined by the $10^{-18}$ erg s$^{-1}$ cm$^{-2}$ arcsec$^{-2}$ isophotal contour, corresponding
to about 460 physical kpc or 1.5 Mpc in co-moving coordinates.
Including (excluding) the emission from UM 287 falling within
the narrow-band filter, the structure has a total line luminosity
$L_{\mathrm{Ly}\alpha} = (1.43 \pm 0.05) \times 10^{45}$ erg s$^{-1}$ ($L_{\mathrm{Ly}\alpha} = (2.2 \pm 0.2) \times 10^{44}$ erg s$^{-1}$).
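
The conversion from the 55-arcsec angular extent to physical and co-moving sizes can be checked with a short calculation. The sketch below assumes a flat ΛCDM cosmology with H0 = 70 km s^-1 Mpc^-1 and Ωm = 0.3; these parameter values and the function names are assumptions made for illustration, and the cosmology actually adopted in the paper may differ slightly.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s
ARCSEC_TO_RAD = math.pi / (180.0 * 3600.0)


def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, steps=10000):
    """Line-of-sight comoving distance in a flat LCDM model (midpoint rule)."""
    omega_l = 1.0 - omega_m  # flatness assumed
    dz = z / steps
    integral = 0.0
    for i in range(steps):
        zm = (i + 0.5) * dz
        integral += dz / math.sqrt(omega_m * (1.0 + zm) ** 3 + omega_l)
    return (C_KM_S / h0) * integral


def projected_size_kpc(theta_arcsec, z, h0=70.0, omega_m=0.3):
    """Return (physical, comoving) transverse size in kpc for an angle in arcsec."""
    d_c = comoving_distance_mpc(z, h0, omega_m)
    d_a = d_c / (1.0 + z)  # angular diameter distance in a flat universe
    physical = theta_arcsec * ARCSEC_TO_RAD * d_a * 1000.0
    return physical, physical * (1.0 + z)


if __name__ == "__main__":
    physical, comoving = projected_size_kpc(55.0, 2.28)
    print(f"physical: {physical:.0f} kpc, comoving: {comoving:.0f} kpc")
```

Under these assumed parameters the result should land within a few per cent of the quoted values of about 460 physical kpc and 1.5 co-moving Mpc; the residual difference reflects the choice of cosmological parameters.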

Although Lyman-α nebulae extending up to about 250 kpc have
been previously detected, the UM 287 nebula represents a system
that is unique so far: given its size, it extends well beyond any plausible 
dark-matter halo associated with UM 287 (see below), representing an 
exceptional example of emitting gas on intergalactic scales. 

The largest Lyman-α nebulae previously discovered (see Fig. 3) are
associated with the most massive dark-matter haloes present in the
high-redshift Universe. High-redshift radio galaxies (HzRGs), inferred
to host obscured but luminous active galactic nuclei (AGN), are
often surrounded by giant Lyman-α envelopes extending up to about
250 kpc at z ~ 3 (ref. 15). Clustering arguments and the observation of
large overdensities of Lyman-α galaxies, together with the lack of X-ray
detection from a possible intracluster medium, suggest that HzRGs are
associated with haloes of $10^{13}$ solar masses ($M_\odot$). With a virial
diameter of about 300 kpc at z ~ 3, these haloes are therefore able to
contain the largest HzRG Lyman-α nebulae. Blind narrow-band surveys
have derived an apparently different population of large nebulae
(termed Lyman-α blobs) with sizes extending up to 180 physical kpc at
z ≈ 3 that, in some cases, do not appear to be associated with a particular
bright galaxy or AGN. The rarity and the strong clustering
of these sources suggest, as for HzRGs, an association with proto-cluster
environments and haloes with masses of about $10^{13} M_\odot$ (refs 20, 21).
Although the detailed origin of the emission of the Lyman-α blobs is
still unclear, the sizes of the associated haloes strongly suggest that the
emitting gas is confined within the halo itself. This is also the case for
the Lyman-α nebulae previously detected around a small number of
bright quasars, extending up to about 100 kpc (refs 10, 22-24). Clustering
studies demonstrate that bright quasars at z < 3 populate haloes of mass
~$10^{12.5} M_\odot$ (that have a virial diameter of about 280 kpc at z ~ 2.3)
independently of their redshift or luminosity.

The exceptional nature of the nebula is due not only to its size (about
460 physical kpc) but also to the fact that it is associated with a radio-quiet
quasar. Radio-quiet quasars have the smallest host halo mass
(~$10^{12.5} M_\odot$) and virial diameter (280 kpc) among previously detected
objects and do not have radio-emitting jets that may power Lyman-α
emission on large scales. In order for the nebula to be fully contained
within the virial radius of a dark-matter halo centred on UM 287, a
halo mass would be required that is at least ten times larger than the
typical value associated with radio-quiet quasars. This would make the
host halo of UM 287 one of the largest known at z > 2, a possibility that
is excluded by the absence of a significant overdensity of Lyman-α
emitters around UM 287 compared to other radio-quiet quasars (see
Methods). Unlike any previous detection, the nebula is therefore
an image of intergalactic gas at z > 2 extending beyond any individual,
associated dark-matter halo. The rarity of these systems may be
explained by the combination of anisotropic emission from the quasars
(typically only about 40% of the solid angle around a bright, high-redshift
quasar is unobstructed), the anisotropic distribution of dense
filaments and light travel effects that, for quasar ages of less than a few
million years, further limit the possible ‘illuminated’ volume.
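
The halo-size argument above can be illustrated with a rough virial-radius estimate. This is a sketch under stated assumptions, not the definition used by the authors: the halo is taken to enclose a mean density of 200 times the critical density at the halo redshift, in a flat ΛCDM cosmology with h = 0.7 and Ωm = 0.3. The function names are hypothetical, and using the one-sided extent of about 285 physical kpc (the value quoted in the Figure 2 caption) as the radius to be enclosed is an illustrative choice.

```python
import math

RHO_CRIT_H2 = 2.775e11  # critical density today in Msun / Mpc^3 for h = 1


def critical_density(z, h=0.7, omega_m=0.3):
    """Critical density at redshift z in Msun / Mpc^3 (flat LCDM assumed)."""
    e2 = omega_m * (1.0 + z) ** 3 + (1.0 - omega_m)
    return RHO_CRIT_H2 * h * h * e2


def virial_radius_kpc(mass_msun, z, delta=200.0, h=0.7, omega_m=0.3):
    """Physical radius enclosing a mean density of delta times critical."""
    rho = delta * critical_density(z, h, omega_m)
    r_mpc = (3.0 * mass_msun / (4.0 * math.pi * rho)) ** (1.0 / 3.0)
    return r_mpc * 1000.0


def mass_ratio_to_enclose(radius_kpc, mass_msun, z, **kwargs):
    """Factor by which the halo mass must grow for its radius to reach radius_kpc."""
    return (radius_kpc / virial_radius_kpc(mass_msun, z, **kwargs)) ** 3


if __name__ == "__main__":
    m_halo = 10 ** 12.5  # typical radio-quiet quasar host mass quoted in the text
    print(f"virial diameter: {2 * virial_radius_kpc(m_halo, 2.3):.0f} kpc")
    print(f"mass ratio: {mass_ratio_to_enclose(285.0, m_halo, 2.3):.1f}")
```

Because the virial radius scales as the cube root of the mass, roughly doubling the radius requires a mass that is close to an order of magnitude larger, which is the scale of the increase invoked in the text.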

In order to constrain the physical properties of this system, we use a
set of Lyman-α radiative transfer calculations combined with an adaptive
mesh refinement simulation of cosmological structure formation
around a dark-matter halo with mass $M_{\mathrm{DM}} \sim 10^{12.5} M_\odot$ (see Methods).
We consider two possible, extreme scenarios for the Lyman-α emission
mechanism of the intergalactic gas associated with the nebula: (1) the
gas is highly ionized by the quasar and the Lyman-α emission is mainly
produced by hydrogen recombinations; and (2) the gas is mostly neutral
and the emission is mainly due to scattering of the Lyman-α and
continuum photons produced by the quasar broad line region. The

Figure 1 | Processed and combined images of the field surrounding the
quasar UM 287. a, b, Each image is 2 arcmin on a side, and the quasar is located
at the centre. In the narrow-band (NB3985) image (a), which is tuned to the
Lyman-α line of the systemic redshift for UM 287, we identify very extended


models are used to obtain scaling relations between the observable
Lyman-α surface brightness from the intergalactic gas surrounding the
quasar and the hydrogen column densities (see Extended Data Fig. 3).
These scaling relations are consistent with analytical expectations. Note
that the estimated column densities for scenario (1) depend on the
ionized gas clumping factor ($C = \langle n_e^2 \rangle / \langle n_e \rangle^2$, where $n_e$ is the electron

Figure 2 | Lyman-α image of the UM 287 nebula. We subtracted from
the narrow-band image the continuum contribution estimated from the
broad-band images (see Methods). The location of UM 287 is labelled with ‘a’.
The colour map and the contours indicate, respectively, the Lyman-α (Lyα)
surface brightness (upper colour scale) and the signal-to-noise ratio per arcsec
aperture (lower colour scale). The extended emission spans a projected
angular size of ~55 arcsec (about 460 physical kpc), measured from the
2σ (~$10^{-18}$ erg s$^{-1}$ cm$^{-2}$ arcsec$^{-2}$) contours. The object marked with ‘b’ is an
optically faint (g ~ 23 AB) quasar at the same redshift as UM 287 (see Extended

(~55 arcsec across) emission. The deep V-band image (b) does not show any
extended emission associated with UM 287. This requires the nebula to be line
emission, and we identify it as Lyman-α at the redshift of the quasar.


density) below the simulation resolution scale, ranging from about 10 
physical kpc for diffuse intergalactic gas to ~160 physical pc for the 
densest regions within galaxies. 

The results are presented in Fig. 4. The observed Lyman-α emission
requires very large column densities of ‘cold’ ($T < 5 \times 10^4$ K) gas, up
to Njy~ 10cm *. The implied total, cold gas mass ‘illuminated’ by 

Data Fig. 2). The nebula appears broadly filamentary and asymmetric,
extending mostly on the eastern side of quasar UM 287 up to a projected
distance of about 35 arcsec (~285 physical kpc) measured from the 2σ
isophotal. The nebula extends towards the southeast in the direction of the 
optically faint quasar. However, the two quasars do not seem to be directly 
connected by this structure that continues as a fainter and spatially narrower 
filament. The large distance between the two quasars and the very broad 
morphology of the nebula argue against the possibility that it may originate 
from an interaction between the quasar host galaxies (see Methods). 

Figure 3 | Luminosity-size relations for previously detected, bright Lyman-α
nebulae and UM 287. The plot includes nebulae surrounding radio galaxies
(black circles), radio-loud quasars (blue open squares), radio-quiet quasars
(blue filled squares) and Lyman-α ‘blobs’ (green triangles). The reported
luminosities include the Lyman-α ($L_{\mathrm{Ly}\alpha}$) emission (within the narrow-band
filters) from any sources embedded in the nebulae, if present. Excluding the
contribution coming directly from the quasar broad line region, the luminosity
of the UM 287 nebula corresponds to $L_{\mathrm{Ly}\alpha} = (2.2 \pm 0.2) \times 10^{44}$ erg s$^{-1}$ (about
16% of the total luminosity). Error bars for UM 287 represent the 1σ
photometric error including continuum-subtraction (error bar is smaller than
the symbol size) and an estimate of the error on the projected maximum extent
using ±1σ isophotal contours with respect to the $10^{-18}$ erg s$^{-1}$ cm$^{-2}$ arcsec$^{-2}$
isophotal. The typical errors for other sources are presented separately in the
bottom-right corner. The dashed line indicates the virial diameter of a dark-matter
halo with total mass $M \sim 10^{12.5} M_\odot$, the typical host of radio-quiet
quasars including UM 287, as confirmed by the analysis of the galaxy
overdensity in our field (see Methods). The UM 287 nebula, unlike
any previous detection, extends on intergalactic medium scales that are well
beyond any possible associated dark-matter halo. Note that even if we restrict
the size measurement of the UM 287 nebula to the $4 \times 10^{-18}$ erg s$^{-1}$ cm$^{-2}$ arcsec$^{-2}$
isophotal to be comparable with the majority of the previous surveys, the
measured apparent size of the UM 287 nebula will be reduced only by
about 20%.


the quasar is $M_{\mathrm{gas}} \sim 10^{12.0 \pm 0.5} M_\odot$ for the ‘mostly ionized’ case (scenario
(1)) assuming C = 1 and $M_{\mathrm{gas}} \sim 10^{11.4 \pm 0.6} M_\odot$ for the ‘mostly neutral’
case (scenario (2)). Note that the total estimated mass for case (1) scales
as $C^{-1/2}$. For comparison, a typical simulated filament in our cosmological
simulation of structure formation with size and morphology similar
to the nebula around a dark-matter halo of mass $M_{\mathrm{DM}} \sim 10^{12.5} M_\odot$ has
a total gas mass of about $M_{\mathrm{gas}} \sim 10^{11.3} M_\odot$, but only about 15% of this
gas is ‘cold’ ($T < 5 \times 10^4$ K), that is, $M_{\mathrm{gas}} \sim 10^{10.5} M_\odot$, and is therefore
able to emit substantial Lyman-α emission. These estimates are consistent
with a large sample of simulated haloes obtained by other recent
works based on cosmological adaptive mesh refinement simulations.
These simulations also show a (weak) decreasing trend of the cold gas
fraction with halo mass.

How can we explain the large differences between the estimated 
mass of cold gas in the nebula and the available amount of cold gas 
predicted by numerical simulations on similar scales? One possibility 
is to assume that the simulations are not resolving a large population of 
small, cold gas clumps within the low-density intergalactic medium 
that are illuminated and ionized by the intense radiation of the quasar. 
In this case, an extremely high clumping factor, up to C ~ 1,000, on 
scales below a few kiloparsecs would be required in order to explain the 
large luminosity of the nebula with the cold gas mass predicted by the 

Figure 4 | Inferred hydrogen column densities associated with the UM 287
nebula. We have converted the observed Lyman-α surface brightness into
gas column densities N using a set of scaling relations obtained with detailed
radiative transfer simulations and consistent with analytical expectations
(see Extended Data Fig. 3 and Methods). We have explored two extreme cases:
first, the gas is mostly ionized by the quasar radiation (a; $N_{\mathrm{HII}}$) and second,
the gas is mostly neutral (b; $N_{\mathrm{HI}}$). Two circular regions with a diameter of
7 arcsec (~8 times the seeing radius) have been masked at the location of the
quasars (black circles). The inferred hydrogen column density in a scales as
$C^{-1/2}$, where C is the gas clumping factor on a spatial scale of about 10 physical
kpc at moderate overdensities (less than about 40 times the mean density of the
Universe at z = 2.28). The implied column densities and gas masses, in both
cases, are at least a factor of ten larger than what is typically observed within
cosmological simulations around massive haloes, suggesting that a large
number of small clumps within the diffuse intergalactic medium may be
missing within current numerical models.


simulations. On the other hand, if some physical process that is not 
fully captured by current grid-based simulations increases the fraction 
of cold gas around the quasar—for example, a proper treatment of metal 
mixing—a smaller clumping factor may be required. In the extreme 
(and rather unrealistic) case that all the hot gas is turned into a cold 
phase, the required clumping factor would be C ~ 20. Even if the gas is 
not ionized by the quasar (scenario (2) above), the simulations are able 
to reproduce the observed mass only if a substantial amount of hot gas 
is converted into a cold phase. Incidentally, this is exactly the same
result produced by comparing the properties of Lyman-α absorption
systems around a large statistical sample of quasars with simulations.
Proper modelling of this gas phase will require a new generation of
numerical models that are able, simultaneously, to spatially resolve
these small intergalactic clumps within large simulation boxes, and to 
treat the multiphase nature of this gas and its interaction with galaxies 
and quasars. 
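
The clumping factors quoted in this discussion follow from the scaling of the inferred mass with $C^{-1/2}$ in the ‘mostly ionized’ scenario. As a minimal sketch (function names are hypothetical; the logarithmic masses are the rounded values given in the text), the required C is the square of the ratio between the mass inferred for C = 1 and the cold gas mass available in the simulation:

```python
import math


def required_clumping_factor(log_mass_c1, log_mass_available):
    """C such that M(C=1) / sqrt(C) equals the available cold gas mass.

    Both arguments are log10 of masses in solar units.
    """
    return 10.0 ** (2.0 * (log_mass_c1 - log_mass_available))


def log_cold_mass(log_total_mass, cold_fraction):
    """log10 of the cold gas mass given the total gas mass and a cold fraction."""
    return log_total_mass + math.log10(cold_fraction)


if __name__ == "__main__":
    # Values taken from the text: 10^12.0 Msun inferred for C = 1,
    # 10^11.3 Msun of total simulated gas, of which about 15% is cold.
    print(log_cold_mass(11.3, 0.15))             # close to 10.5
    print(required_clumping_factor(12.0, 10.5))  # simulated cold gas only
    print(required_clumping_factor(12.0, 11.3))  # if all simulated gas were cold
```

With the simulated cold gas alone this gives C of order 1,000, and if all the simulated gas were cold it gives a value of a few tens, the same order as the C ~ 20 quoted above; the exact numbers depend on the rounding of the masses.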


METHODS SUMMARY 


We observed UM 287 for a total of 10 h in a series of dithered, 1,200-s exposures. In
parallel, we obtained 10 h of broad-band V images with the LRIS-red camera and
1 h of B-band imaging. For all observations, we used the D460 dichroic beam
splitter. We binned the blue CCDs 2 × 2 to minimize read noise. The images were
processed using standard routines within the reduction software IRAF, including
bias subtraction, flat fielding and illumination correction. A combination of twilight
sky flats and unregistered science frames has been used to produce flat-field
images and illumination corrections for each band. We have calibrated the photometry
of our images using two spectrophotometric stars (Feige 110 and Feige 34)
and the standard star field PG 0231+051. To isolate the emission in the Lyman-α
line we estimated and then subtracted the continuum emission from discrete and
extended sources contained within the NB3985 filter using a combination of the V
band and B band. We derived a relation between the observable Lyman-α emission
from diffuse gas illuminated by a quasar and the gas column densities by combining
a Lyman-α radiative transfer model with the results of a cosmological hydrodynamical
simulation of structure formation at z = 2.3 (ref. 5). The cosmological
simulation consists of a $40^3$ co-moving Mpc$^3$ cosmological volume with a $10^3$ co-moving
Mpc$^3$ high-resolution region containing a massive halo compatible with
the expected quasar hosts ($M_{\mathrm{DM}} \sim 10^{12.5} M_\odot$). The equivalent base-grid resolution
in the high-resolution region corresponds to a ($1{,}024^3$) grid with a dark-matter
particle mass of about $1.8 \times 10^6 M_\odot$. We adaptively refined the grid by a factor of
$2^6$, reaching a maximum spatial resolution of about 0.6 co-moving kpc, that is,
about 165 proper pc at z = 2.3. We have then applied in post-processing an ionization
and Lyman-α radiative transfer using the RADAMESH adaptive mesh
refinement code.


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

METHODS


Observations and data reduction. As part of a continuing programme to search 
for Lyman-α emission associated with the fluorescence of quasar ionizing radiation,
we obtained deep, narrow-band imaging of the field surrounding UM 287, also
known as PHL 868 and LBQS 0049+0045. UM 287 was discovered in the University
of Michigan emission-line survey, has a precisely measured redshift z = 2.279 ± 0.001
based on analysis of [O III] emission lines, and has a bolometric luminosity
Lpa 10°77 ergs | estimated from its 1,450-A rest-frame flux using standard 
cosmology”. This places it in the upper quartile of ultraviolet-bright quasars at 
this redshift. Assuming that the spectral energy distribution follows a power law” 
with frequency index « = —1.57 at energies exceeding 1 R, we estimate the lumin- 
osity of ionizing photons™! to be = 10°”* s_' assuming isotropic emission. 

The quasar has no counterpart in the FIRST images at 20 cm (1.4 GHz), and based on the
FIRST coverage maps we obtain a 5σ flux limit $F_{\rm radio}$ < 0.76 mJy, which, given its
large ultraviolet luminosity, classifies this quasar as radio-quiet. We selected this source
for imaging based solely on its high luminosity, its precisely measured redshift, and its
radio-quiet characteristics. We purchased a custom-designed narrow-band filter from Andover
Corporation, sized to fit within the grism holder of the Keck/LRISb camera. The filter was
tuned to Lyman α at the source’s systemic redshift and we requested a narrow band-pass
(full-width at half-maximum FWHM ≈ 3 nm) that minimized sky background while maximizing
throughput. Extended Data Fig. 1 presents the as-measured transmission curve of the
NB3985 filter.

We observed UM 287 on the nights of UT 12-13 November 2013 for a total of 10 h, in a series
of dithered, 1,200 s exposures. Conditions were clear, with atmospheric seeing varying from
FWHM ≈ 0.6 to 1 arcsec. In parallel, we obtained 10 h of broad-band V images with the LRISr
camera and 1 h of B-band imaging. For all observations, we employed the D460 dichroic beam
splitter. We binned the blue CCDs 2 × 2 to minimize read noise.

All of these data were processed with standard techniques. Bias subtraction was 
performed using measurements from the overscan regions of each image. The 
images have been reduced using standard routines within the reduction software 
IRAF, including bias subtraction, flat fielding and illumination correction. A 
combination of twilight sky flats and unregistered science frames has been used 
to produce flat-field images and illumination corrections for each band. Each
individual frame has been registered on the SDSS-DR7 catalogue using SExtractor
and SCAMP in sequence. The astrometric uncertainty of our registered images is
about 0.2 arcsec. Finally, for each band (NB3985, B, V), the corrected frames were
average-combined using SWarp.

We have calibrated the photometry of our images in the following manner.
First, during the two nights we observed two spectrophotometric stars (Feige 110
and Feige 34) through the narrow-band filter, under clear conditions. For the
broad-band images, we observed the standard star field PG 0231+051.

To compute the zero-point for the narrow-band images, we first measured the number of counts
per second of the standard stars Feige 110 and Feige 34. We then compared this measurement
with the flux expected, estimated by convolving the spectrum of the standard star with the
normalized filter transmission curve (Extended Data Fig. 1). The two measurements agreed to
within 0.1 mag. We attribute the difference to small variations in the transparency and adopt
an average zero-point of 24.14 mag. The surface brightness limit for our observation in the
central region of the image occupied by the nebula is about $5 \times 10^{-19}$ erg s$^{-1}$
cm$^{-2}$ arcsec$^{-2}$ at the 1σ level within an aperture of 1 arcsec$^2$.
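
The narrow-band calibration described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy Gaussian filter, the flat-spectrum star and the count rate are hypothetical stand-ins for the measured NB3985 curve and the Feige spectra, the AB magnitude system is assumed, and the transmission weighting is done directly in wavelength for simplicity.

```python
import math

def band_averaged_flux(wavelengths, flux_density, transmission):
    """Transmission-weighted mean flux density over a filter (trapezoidal rule)."""
    numerator = 0.0
    denominator = 0.0
    for i in range(len(wavelengths) - 1):
        step = wavelengths[i + 1] - wavelengths[i]
        numerator += 0.5 * step * (flux_density[i] * transmission[i]
                                   + flux_density[i + 1] * transmission[i + 1])
        denominator += 0.5 * step * (transmission[i] + transmission[i + 1])
    return numerator / denominator

def ab_magnitude(f_nu):
    """AB magnitude for a flux density in erg s^-1 cm^-2 Hz^-1 (assumed system)."""
    return -2.5 * math.log10(f_nu) - 48.6

def zero_point(expected_mag, counts_per_second):
    """Zero-point ZP defined by mag = ZP - 2.5 log10(counts per second)."""
    return expected_mag + 2.5 * math.log10(counts_per_second)

# Hypothetical inputs: a toy Gaussian filter near 3,985 Angstrom and a flat-spectrum star.
wavelengths = [3935.0 + i for i in range(101)]
transmission = [math.exp(-0.5 * ((w - 3985.0) / 12.7) ** 2) for w in wavelengths]
star_f_nu = [3.631e-23] * len(wavelengths)

expected_mag = ab_magnitude(band_averaged_flux(wavelengths, star_f_nu, transmission))
zp = zero_point(expected_mag, counts_per_second=4.5e6)  # count rate is made up
print(f"expected magnitude = {expected_mag:.3f}, zero-point = {zp:.2f}")
```

Repeating the last two lines for each standard star and averaging the resulting zero-points mirrors the procedure in the text.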

For the broad-band images, we compared the number of counts per second of the five stars in
the PG 0231+051 field with their tabulated V and B magnitudes. The derived zero-points for
the five stars are consistent with each other within a few per cent and we adopt the average
values: $B_{\rm ZP}$ = 28.40 mag and $V_{\rm ZP}$ = 28.07 mag.
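
As a rough illustration of the broad-band step, the sketch below derives one zero-point per star and then the mean and scatter. The tabulated magnitudes and count rates are invented for the example (they are not the PG 0231+051 values), and the function names are hypothetical.

```python
import math

def star_zero_points(tabulated_mags, count_rates):
    """Per-star zero-points from mag = ZP - 2.5 log10(counts per second)."""
    return [m + 2.5 * math.log10(r) for m, r in zip(tabulated_mags, count_rates)]

def mean_and_scatter(values):
    """Sample mean and sample standard deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean, math.sqrt(variance)

# Invented example: five stars whose count rates scatter by a few per cent
# around the rates implied by a zero-point of 28.07 mag.
tabulated_mags = [12.8, 13.7, 14.0, 14.7, 16.1]
perturbations = [1.02, 0.99, 1.00, 0.98, 1.01]
count_rates = [p * 10 ** (0.4 * (28.07 - m)) for m, p in zip(tabulated_mags, perturbations)]

zero_points = star_zero_points(tabulated_mags, count_rates)
mean_zp, scatter_zp = mean_and_scatter(zero_points)
print(f"mean zero-point = {mean_zp:.2f} mag, scatter = {scatter_zp:.3f} mag")
```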

As the standard stars and the PG 0231+051 field were observed with a similar airmass of
approximately 1.2, which corresponds to the average airmass of our observations, we did not
correct the individual images for airmass before combination. Moreover, by monitoring
unsaturated stars on several exposures, we estimated that the correction would be of the
order of a few per cent.
Continuum subtraction. To isolate the emission in the Lyman-α line we estimated and then
subtracted the continuum emission from discrete and extended sources contained within the
NB3985 filter. We estimated the continuum using a combination of the V-band and B-band
images as follows. First, we smoothed both of the broad-band images using a Gaussian kernel
of 1 arcsec and set to zero all of the pixels with values less than the measured
root-mean-square (1σ). Additionally, in the V band we set to zero all of the pixels that
have signal above 1σ in the B band, as we prefer to use the latter image when possible given
that it lies closer in wavelength to the Lyman-α line.

After matching the seeing between the narrow-band and the broad-band images, 
the continuum subtraction has been applied using the following formula 


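$$
{\rm Ly}\alpha = {\rm NB3985}
 - a \left(\frac{{\rm FWHM}_{\rm NB3985}}{{\rm FWHM}_{B}}\right)
     \left(\frac{{\rm Tr}_{\rm NB3985}}{{\rm Tr}_{B}}\right) B
 - b \left(\frac{{\rm FWHM}_{\rm NB3985}}{{\rm FWHM}_{V}}\right)
     \left(\frac{{\rm Tr}_{\rm NB3985}}{{\rm Tr}_{V}}\right) V
$$
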
where Lyα is the final subtracted image, NB3985 is the smoothed narrow-band image, B and V
are the smoothed and masked broad-band images, and Tr$_{\rm NB3985}$, Tr$_B$ and Tr$_V$ are
the transmission peak values for the NB3985, B-band and V-band filters, respectively. The
parameters a = 0.85 and b = 0.65 allow a better match to the continuum. Following this
procedure, we primarily used the smoothed B-band image to estimate the continuum and we
included the V band to achieve deeper sensitivity and to correct those objects not detected
in the B-band image.
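
The masking and subtraction steps can be sketched as follows. This is a minimal illustration, not the authors' pipeline: Gaussian smoothing and seeing matching are omitted, images are plain nested lists, and the broad-band filter widths and all peak transmissions are hypothetical placeholders. Only a = 0.85, b = 0.65 and the roughly 3 nm narrow-band width come from the text.

```python
def continuum_scale(coefficient, fwhm_nb, fwhm_bb, tr_nb, tr_bb):
    """Factor that rescales a broad-band image to the narrow-band filter."""
    return coefficient * (fwhm_nb / fwhm_bb) * (tr_nb / tr_bb)

def mask_below(image, rms):
    """Set to zero every pixel below the 1-sigma rms of the image."""
    return [[value if value >= rms else 0.0 for value in row] for row in image]

def mask_where_detected(image, reference, reference_rms):
    """Set to zero every pixel where the reference band is detected above 1 sigma."""
    return [[0.0 if ref > reference_rms else value for value, ref in zip(row, ref_row)]
            for row, ref_row in zip(image, reference)]

def subtract_continuum(nb, b_img, v_img, scale_b, scale_v):
    """Ly-alpha image = NB - scale_b * B - scale_v * V, pixel by pixel."""
    return [[n - scale_b * b - scale_v * v for n, b, v in zip(n_row, b_row, v_row)]
            for n_row, b_row, v_row in zip(nb, b_img, v_img)]

# Hypothetical filter properties (widths in Angstrom, peak transmissions).
scale_b = continuum_scale(0.85, fwhm_nb=30.0, fwhm_bb=1000.0, tr_nb=0.6, tr_bb=0.7)
scale_v = continuum_scale(0.65, fwhm_nb=30.0, fwhm_bb=900.0, tr_nb=0.6, tr_bb=0.8)

# Tiny made-up images: one pixel detected in B, one detected only in V.
nb = [[5.0, 1.0]]
b_smoothed = [[100.0, 0.2]]
v_smoothed = [[80.0, 50.0]]

b_masked = mask_below(b_smoothed, rms=1.0)
v_masked = mask_where_detected(mask_below(v_smoothed, rms=1.0), b_smoothed, reference_rms=1.0)
lya = subtract_continuum(nb, b_masked, v_masked, scale_b, scale_v)
print(lya)
```

The first pixel is corrected with the B band only, the second with the V band only, which is the preference order described above.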

Data reduction and analysis for the companion quasar. Upon analysing the continuum-subtracted
Lyman-α image, we identified a compact Lyman-α excess source at 24.3 arcsec separation from
UM 287 (corresponding to about 200 physical kpc), which has a faint counterpart in our LRIS
continuum image and is also detected in the SDSS (g = 22.8 ± 0.1). Further exploration of
this source reveals that it is detected by the FIRST survey (FIRST J005203.26+010108.6) with
a flux $F_{\rm peak}$ = 21.38 mJy, strongly suggesting that this source is a radio-loud but
optically faint quasar. On UT 08 December 2013, we obtained a long-slit spectrum of
J005203.26+010108.6 using the Keck/LRIS spectrometer configured with the D560 dichroic, the
600/4000 grism in the LRISb camera, and the 600/10000 grating in the LRISr camera. We
oriented the long slit to also cover UM 287.

These data were reduced with the LowRedux (http://www.ucolick.org/~xavier/LowRedux/index.html)
software package using standard techniques. Extended Data Fig. 2 presents the two optimally
extracted spectra from the LRISb camera. One recognizes the broad and bright emission lines
characteristic of type I quasars. The redshift estimated from these lines, which has an
error of about 800 km s$^{-1}$ (1σ), is consistent with the systemic redshift of UM 287,
suggesting that UM 287 is actually a member of a binary system with a fainter companion. We
emphasize, however, that there is very little (if any) Lyman-α emission apparent in the
narrow-band image that may be associated with J005203.26+010108.6 apart from that produced
by its own nuclear activity.

Because of the large distance from UM 287 (at least 200 physical kpc and up to 4 physical
Mpc considering the 1σ redshift error) and the morphology of the nebula, we can exclude the
possibility that the UM 287 nebula is the result of tidal interaction due to a merging event
between the two quasar hosts. Indeed, such a large separation would imply that any possible
encounter between the two quasars is probably a high-velocity interaction or an encounter
with large impact parameter. We note that it is not impossible but extremely difficult to
produce a long and massive tidal tail during a ‘fast’ encounter, and the amount of gas
stripped by the quasar host galaxies in the best scenario would probably be a very small
fraction (<10%) of its total interstellar medium. Irrespective of the details of the possible
interaction between the two quasar host galaxies, any resulting long tidal tail would be
very thin, with sizes of the order of a few kpc or less, whereas the observed nebula has a
FWHM thickness of at least 100 physical kpc at its widest point.
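
The two separations quoted above (about 200 physical kpc in projection, and up to a few physical Mpc along the line of sight for an 800 km s$^{-1}$ redshift error) can be checked with a short calculation. The sketch assumes a flat ΛCDM cosmology with H0 = 70 km s$^{-1}$ Mpc$^{-1}$, Ωm = 0.3 and ΩΛ = 0.7. These parameters are an assumption made for illustration, since the text only refers to a "standard cosmology".

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def hubble_ratio(z, omega_m, omega_lambda):
    """E(z) = H(z)/H0 for a flat matter plus Lambda universe."""
    return math.sqrt(omega_m * (1.0 + z) ** 3 + omega_lambda)

def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, omega_lambda=0.7, steps=2000):
    """Line-of-sight co-moving distance via Simpson's rule (steps must be even)."""
    width = z / steps
    total = (1.0 / hubble_ratio(0.0, omega_m, omega_lambda)
             + 1.0 / hubble_ratio(z, omega_m, omega_lambda))
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight / hubble_ratio(i * width, omega_m, omega_lambda)
    return (C_KM_S / h0) * total * width / 3.0

def proper_kpc_per_arcsec(z, **cosmo):
    """Proper transverse scale from the angular diameter distance."""
    angular_diameter_mpc = comoving_distance_mpc(z, **cosmo) / (1.0 + z)
    return angular_diameter_mpc * 1000.0 * math.radians(1.0 / 3600.0)

def line_of_sight_proper_mpc(delta_v_km_s, z, h0=70.0, omega_m=0.3, omega_lambda=0.7):
    """Proper distance corresponding to a velocity offset in pure Hubble flow."""
    return delta_v_km_s / (h0 * hubble_ratio(z, omega_m, omega_lambda))

projected_kpc = 24.3 * proper_kpc_per_arcsec(2.279)
line_of_sight_mpc = line_of_sight_proper_mpc(800.0, 2.279)
print(f"projected: {projected_kpc:.0f} kpc, line of sight: {line_of_sight_mpc:.1f} Mpc")
```

Under these assumed parameters the projected separation comes out close to 200 kpc and the line-of-sight uncertainty is a few Mpc, in line with the figures in the text.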
Galaxy overdensity analysis. We have obtained a sample of 60 Lyman-α emitter (LAE) candidates
above a flux limit of $3 \times 10^{-18}$ erg s$^{-1}$ cm$^{-2}$ (corresponding to a Lyman-α
luminosity of about $2 \times 10^{41}$ erg s$^{-1}$) within the volume probed by our
narrow-band imaging (~3,100 co-moving Mpc$^3$) around UM 287. The selection is based on the
same technique applied to our pilot survey.

How does the number density in our survey compare to other similar searches around massive
objects? Surveys of LAEs around HzRGs have revealed large overdensities of LAEs with respect
to field studies at similar redshifts that are compatible with the presence of a massive
halo as estimated from clustering, that is, $10^{13}\,M_\odot$. Narrow-band imaging of the
radio galaxy MRC 1138-262 at z = 2.16 (ref. 42), associated with a 200-kpc-scale Lyman-α
nebula, found a number density of LAEs above $L_{\rm Ly\alpha} = 1.4 \times 10^{42}$ erg
s$^{-1}$ of $n_{\rm HzRG} \approx (10 \pm 2) \times 10^{-3}$ co-moving Mpc$^{-3}$. By
comparison, the number density of LAEs above the same limit at the same redshift in the
field is $n_{\rm field} \approx (1.5 \pm 0.5) \times 10^{-3}$ co-moving Mpc$^{-3}$,
corrected for completeness. Restricting our sample to the same luminosity cut, we find a
number density of $n_{\rm UM287} \approx (5 \pm 1) \times 10^{-3}$ co-moving Mpc$^{-3}$.
Note that, at this luminosity, our sample is complete. Despite the large statistical
errors, we note that the overdensity with respect to the field around UM 287 (about
a factor of three) is significantly smaller than the overdensity of LAEs around
MRC 1138-262 (about a factor of six). A similar result is obtained comparing the
overdensity of LAEs around UM 287 with other HzRGs, suggesting that UM 287
is hosted by a smaller halo than typical HzRG hosts. Moreover, the modest overdensity
of our field is strong evidence against the possibility that the UM 287 nebula
may be fully contained by an individual dark-matter halo of mass 10'°° $M_\odot$, as
would be required by its size. Note that the galaxy number density estimate around
UM 287 is a conservative upper limit: if the quasar is illuminating the surrounding
volume, we expect a boost in the number of detectable LAE objects due to fluorescence, as
demonstrated in our pilot survey. Our measurement is also compatible with the number density
of LAEs found by other recent, shallower surveys for Lyman-α emission around eight
radio-quiet, bright quasars at z ≈ 2.7 that have a host halo mass of $10^{12.5}\,M_\odot$ as
constrained by the clustering of Lyman break galaxies. These studies found number densities
ranging from $6 \times 10^{-3}$ to $22 \times 10^{-3}$ co-moving Mpc$^{-3}$ around
individual quasars above a Lyman-α luminosity of $L_{\rm Ly\alpha} = 5.8 \times 10^{41}$ erg
s$^{-1}$. Combining the eight fields, the average number density from their survey is
$(12.0 \pm 0.4) \times 10^{-3}$ co-moving Mpc$^{-3}$.

Using the same luminosity cut, we find a number density of $(12 \pm 2) \times 10^{-3}$
co-moving Mpc$^{-3}$, suggesting that the halo mass of UM 287 is indeed within the
typical range for the host haloes of radio-quiet quasars.
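
The overdensity factors discussed in this section follow from simple ratios of number densities. The sketch below propagates the quoted errors assuming they are independent and Gaussian, which is an assumption of this example rather than a statement about the original analysis. The densities are those read from the text, in units of $10^{-3}$ co-moving Mpc$^{-3}$.

```python
import math

def ratio_with_error(value, error, reference, reference_error):
    """Ratio of two quantities with first-order, uncorrelated error propagation."""
    ratio = value / reference
    relative = math.sqrt((error / value) ** 2 + (reference_error / reference) ** 2)
    return ratio, ratio * relative

# Number densities above the 1.4e42 erg/s luminosity cut, in units of 1e-3 co-moving Mpc^-3.
n_field = (1.5, 0.5)
n_um287 = (5.0, 1.0)
n_hzrg = (10.0, 2.0)

overdensity_um287, err_um287 = ratio_with_error(*n_um287, *n_field)
overdensity_hzrg, err_hzrg = ratio_with_error(*n_hzrg, *n_field)
print(f"UM 287: {overdensity_um287:.1f} +/- {err_um287:.1f}")
print(f"MRC 1138-262: {overdensity_hzrg:.1f} +/- {err_hzrg:.1f}")
```

The central values are consistent with the "about a factor of three" and "about a factor of six" quoted in the text, and the propagated errors make the "large statistical errors" caveat concrete.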

Converting the observed Lyman-α emission to gas column densities. We derived a relation
between the observable Lyman-α emission from diffuse gas illuminated by a quasar and the gas
column densities by combining a Lyman-α radiative transfer model with the results of a
cosmological hydrodynamical simulation of structure formation at z = 2.3 (ref. 5). The
cosmological simulations have been obtained with the adaptive mesh refinement code RAMSES
and consist of a $40^3$ co-moving Mpc$^3$ cosmological volume with a $10^3$ co-moving
Mpc$^3$ high-resolution region containing a massive halo compatible with the expected quasar
hosts, that is, with a dark-matter mass $M_{\rm DM} \approx 10^{12.5}\,M_\odot$. The
equivalent base-grid resolution in the high-resolution region corresponds to a $1{,}024^3$
grid with a dark-matter particle mass of about $1.8 \times 10^6\,M_\odot$. We used six
additional grid refinement levels, reaching a maximum spatial resolution of about 0.6
co-moving kpc, that is, about 165 physical pc at z = 2.3. Star formation, supernova
feedback, and an optically thin ultraviolet background with an on-the-fly self-shielding
correction are included using a typical choice of sub-grid parameters for the simulation
resolution. We then applied, in post-processing, ionization and Lyman-α radiative transfer
using the RADAMESH adaptive mesh refinement code. Ionizing, Lyman-α and non-ionizing
continuum radiation from the quasar broad line region are propagated within two symmetric
cones that cover half of the solid angle around the quasar. We included light-travel and
finite light-speed effects for both ionizing and Lyman-α radiation transfer and varied the
quasar age (from 1 Myr to 10 Myr) and the orientation of the emission cones with respect to
the observer line-of-sight and the cosmic web surrounding the simulated halo. We note that
these effects are able to produce asymmetric Lyman-α nebulae with sizes and morphologies
similar to the observations for short quasar ages (<5 Myr).
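
The quoted maximum resolution follows from the box size, the base grid and the number of refinement levels, as the following small check shows. It assumes each refinement level halves the cell size, which is the usual convention for adaptive mesh refinement codes; the function name is purely illustrative.

```python
def cell_size_kpc(box_mpc, base_cells_per_side, refinement_levels):
    """Co-moving cell size after a number of factor-of-two refinements."""
    return box_mpc * 1000.0 / (base_cells_per_side * 2 ** refinement_levels)

base_cell = cell_size_kpc(40.0, 1024, 0)    # equivalent base grid
finest_cell = cell_size_kpc(40.0, 1024, 6)  # six additional refinement levels
print(f"base cell: {base_cell:.1f} co-moving kpc, finest cell: {finest_cell:.2f} co-moving kpc")
```

A 40 co-moving Mpc box sampled by 1,024 cells per side and refined six times gives cells of roughly 0.6 co-moving kpc, matching the value in the text.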

In order to produce a calibrated relation for scenario 1 as discussed in the main text, we
have fixed the quasar ionizing and Lyman-α luminosity to the observed value and assumed that
the ionizing and Lyman-α emitting cones are coincident. We have then produced mock images
with the same angular resolution as the observation that have been convolved with a point
spread function (PSF) with 1 arcsec size to simulate atmospheric seeing. A column density
map of cold ($T < 5 \times 10^4$ K) ionized hydrogen was produced from the simulations
considering only the gas ‘illuminated’ by the quasar and convolved with the same PSF. We
have then cross-correlated the two quantities pixel by pixel and fitted the calibrated
relation shown as a solid line in the left panel of Extended Data Fig. 3. This relation is
consistent with analytical expectations from highly ionized gas where the Lyman-α emission
is mostly produced by hydrogen recombination with a negligible contribution from collisional
excitations and Lyman-α scattering (or photon-pumping) from the quasar non-ionizing
continuum and Lyman-α radiation. We have repeated the experiment varying the sub-grid
clumping factor (C) below the simulation resolution and found, as expected for highly
ionized gas, that the simulated surface brightness scales linearly with C at a given gas
column density.
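
The scenario 1 scaling can be written as a pair of inverse functions. The sketch assumes the form $N_{\rm HII} \propto ({\rm SB}/C)^{1/2}$ with a normalization of $10^{21}$ cm$^{-2}$ for SB in units of $10^{-18}$ erg s$^{-1}$ cm$^{-2}$ arcsec$^{-2}$, as read from the Extended Data Fig. 3 caption. Treat the normalization as an assumption of this example wherever the extracted caption is unclear.

```python
import math

def column_density_from_sb(sb, clumping, normalization=1e21):
    """Ionized hydrogen column density (cm^-2) for SB in units of 1e-18 cgs."""
    return normalization * math.sqrt(sb / clumping)

def sb_from_column_density(column_density, clumping, normalization=1e21):
    """Inverse relation: SB rises linearly with the clumping factor C."""
    return clumping * (column_density / normalization) ** 2

n_hii = column_density_from_sb(sb=1.0, clumping=1.0)
sb_clumpy = sb_from_column_density(n_hii, clumping=10.0)
print(f"N_HII = {n_hii:.1e} cm^-2, SB with C = 10: {sb_clumpy:.1f}")
```

At fixed column density, raising C by a factor of ten raises the surface brightness by the same factor, which is the linear scaling described in the text.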

We have also considered the extreme case in which the simulated gas is only illuminated by
non-ionizing radiation from the quasar, and therefore that dense gas in the simulation
remains mostly neutral (scenario 2 in the main text) above the self-shielding density to the
cosmic ultraviolet background (about 0.01 atoms cm$^{-3}$). We obtained and post-processed a
mock image as in the previous case and cross-correlated the resulting Lyman-α surface
brightness with the neutral hydrogen column densities ($N_{\rm HI}$). Despite the large
scatter, we found a good correlation between these two quantities (right panel of Extended
Data Fig. 3) if the surface brightness is normalized by the impact parameter (b) squared.
The relation between the Lyman-α surface brightness, neutral hydrogen column density and
impact parameter is consistent with simple analytical expectations from pure Lyman-α
scattering from the broad line region of the quasar for Lyman-α optical depth much larger
than unity. In this case, the amount of photon-pumping (or, analogously, the equivalent
width of the absorbed quasar Lyman-α and continuum radiation) is dominated by the line
damping wing and therefore is proportional to $N_{\rm HI}^{1/2}$.
Extended Data Figure 1 | Measured transmission curves of the filters used in this study.
Solid line, NB3985; dotted lines, B band (left) and V band (right). Bottom axis, observed
wavelength; top axis, the rest-frame wavelength for sources at z = 2.27.


©2014 Macmillan Publishers Limited. All rights reserved 


[Figure: Keck/LRIS spectra. Vertical axis, relative flux; horizontal axis, wavelength (Å), 3,500-5,500 Å.]

Extended Data Figure 2 | Keck/LRIS spectrum of UM 287 and of the faint, radio-loud companion
quasar. Black line, spectrum of this companion quasar, which is indicated by ‘b’ in Fig. 2
and is separated by about 24 arcsec from UM 287. Blue line, spectrum of UM 287. Comparison
of the two spectra clearly shows that this companion is a quasar at a redshift similar to
that of UM 287.
Extended Data Figure 3 | Pixel-to-pixel correlations for Lyman-α surface brightness for
scenarios 1 and 2 in the main text. a, Pixel-to-pixel correlation between simulated Lyman-α
surface brightness (SB) divided by the clumping factor (C) and corresponding cold
($T < 5 \times 10^4$ K) ionized hydrogen column densities $N_{\rm HII}$ for scenario 1 (see
text for details). The solid line indicates the relation
$N_{\rm HII} = 10^{21} \times ({\rm SB})^{1/2} \times C^{-1/2}$ (here SB is in units of
$10^{-18}$ erg s$^{-1}$ cm$^{-2}$ arcsec$^{-2}$ and C is dimensionless). b, Pixel-to-pixel
correlation between simulated Lyman-α surface brightness (normalized by the quasar impact
parameter squared, $b^2$) and corresponding neutral hydrogen column density for scenario 2
(see text for details). The solid line represents the relation
$N_{\rm HI}$ = 10'°” × [(SB) × (b/100)$^2$]$^2$ cm$^{-2}$ (here b is in units of kpc).
Symmetry permeates nature and is fundamental to all laws of physics. One example is parity
(mirror) symmetry, which implies that flipping left and right does not change the laws of
physics. Laws for electromagnetism, gravity and the subatomic strong force respect parity
symmetry, but the subatomic weak force does not. Historically, parity violation in electron
scattering has been important in establishing (and now testing) the standard model of
particle physics. One particular set of quantities accessible through measurements of
parity-violating electron scattering are the effective weak couplings $C_{2q}$, sensitive to
the quarks’ chirality preference when participating in the weak force, which have been
measured directly only once in the past 40 years. Here we report a measurement of the
parity-violating asymmetry in electron-quark scattering, which yields a determination of
$2C_{2u} - C_{2d}$ (where u and d denote up and down quarks, respectively) with a precision
increased by a factor of five relative to the earlier result. These results provide evidence
with greater than 95 per cent confidence that the $C_{2q}$ couplings are non-zero, as
predicted by the electroweak theory. They lead to constraints on new parity-violating
interactions beyond the standard model, particularly those due to quark chirality. Whereas
contemporary particle physics research is focused on high-energy colliders such as the Large
Hadron Collider, our results provide specific chirality information on electroweak theory
that is difficult to obtain at high energies. Our measurement is relatively free of
ambiguity in its interpretation, and opens the door to even more precise measurements in the
future.

In parity-violating electron scattering (PVES) experiments, an asymmetry is measured that
can be expressed as

$$
A_{\rm PV} = \frac{\sigma_R - \sigma_L}{\sigma_R + \sigma_L} \tag{1}
$$


where $\sigma_R$ ($\sigma_L$) are the cross-sections for scattering longitudinally polarized
electrons that are in the right-handed R (left-handed L) helicity state, meaning their spins
are parallel (antiparallel) to the electron’s momentum. For deep inelastic scattering (DIS)
from nuclear targets (DIS is defined as scattering in which the electron interacts with a
single quark, almost independent of the surrounding quarks and gluons), this asymmetry can
be written in a largely model-independent way as


$$
A_{\rm PV} = \frac{G_F Q^2}{4\sqrt{2}\,\pi\alpha}
\left[ a_1(x,Q^2)\,Y_1(x,y,Q^2) + a_3(x,Q^2)\,Y_3(x,y,Q^2) \right] \tag{2}
$$

where $G_F$ is the Fermi constant, α is the fine-structure constant, $Q^2 = -q^2$ with q the
four-momentum transferred from the electron to the target, x is the Bjorken scaling variable
and describes the fraction of momentum carried by the quark struck by the electron,
$y = (E - E')/E$ is the fractional energy loss of the electron with $E$ ($E'$) the incident
(scattered) electron energy, $Y_{1,3}$ are kinematic factors, and the variables $a_{1,3}$
are related to the subatomic structure of the target. (See Supplementary Methods for a
complete description.) The first experiment (SLAC E122) to detect parity violation in
electron scattering provided results that strongly favoured the model of refs 6-8,
establishing it as the keystone of the now highly successful standard model of particle
physics. PVES has subsequently been used as a sensitive probe to study diverse physics,
ranging from physics beyond the standard model to the structure of both nuclei and the
nucleon (ref. 12 and references therein).

In so-called tree-level scattering, where the electron exchanges only a single photon or a
single Z boson with the target, very simple expressions for $a_{1,3}$ in equation (2) emerge
for electron DIS from deuterium:

$$
a_1 = \frac{6}{5}\,(2C_{1u} - C_{1d}), \qquad a_3 = \frac{6}{5}\,(2C_{2u} - C_{2d}) \tag{3}
$$


The use of the deuterium target simplifies the interpretation because it has equal numbers
of up and down valence quarks. Here, $C_{1u(1d)}$ and $C_{2u(2d)}$ are the effective weak
couplings between the electrons and the up (down) quarks, often collectively written as
$C_{1q}$ and $C_{2q}$. The subscripts 1 and 2 refer to whether the coupling to the electron
or quark is vector or axial-vector in nature: $C_{1u(d)}$ is the (AV) combination of the
electron’s axial-vector weak charge and the quark’s vector weak charge, that is, it probes
parity violation caused by the difference in the $Z^0$ coupling between left- and
right-handed electron chiral states; $C_{2u(d)}$ is the (VA) combination of the electron’s
vector weak charge and the quark’s axial-vector weak charge, which is sensitive to parity
violation due to the different quark chiral states. In testing the standard model it is
important to determine all four $C_{1u,1d,2u,2d}$ as accurately as possible, because new
interactions could manifest themselves in either set of couplings. Experimentally, one could
extract both $2C_{1u} - C_{1d}$ and $2C_{2u} - C_{2d}$ by measuring asymmetries at different
$Y_{1,3}$ values in the DIS regime. However, a precise determination of $2C_{2u} - C_{2d}$
is difficult because of its small value in the standard model (−0.095), as opposed to
$2C_{1u} - C_{1d}$ = −0.719.
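
To give a feeling for the size of the effect, the sketch below evaluates the tree-level expression for deuterium, assuming the standard form $A_{\rm PV} = \frac{G_F Q^2}{4\sqrt{2}\pi\alpha}[a_1 Y_1 + a_3 Y_3]$ with $a_{1,3}$ from equation (3). The constants are the commonly tabulated values of $G_F$ and α. The kinematic inputs ($Q^2$ = 1 GeV², $Y_1$ = 1, $Y_3$ = 0.5) are hypothetical, and the coupling combinations are the standard-model values quoted above.

```python
import math

G_F = 1.1663787e-5       # Fermi constant in GeV^-2
ALPHA = 1.0 / 137.035999  # fine-structure constant

def kinematic_prefactor(q2_gev2):
    """G_F Q^2 / (4 sqrt(2) pi alpha), dimensionless for Q^2 in GeV^2."""
    return G_F * q2_gev2 / (4.0 * math.sqrt(2.0) * math.pi * ALPHA)

def tree_level_asymmetry(q2_gev2, y1, y3, c1_combination, c2_combination):
    """Tree-level A_PV for deuterium with a_1, a_3 = (6/5) x coupling combinations."""
    a1 = 1.2 * c1_combination
    a3 = 1.2 * c2_combination
    return kinematic_prefactor(q2_gev2) * (a1 * y1 + a3 * y3)

# Standard-model coupling combinations quoted in the text; kinematics are made up.
asymmetry = tree_level_asymmetry(1.0, y1=1.0, y3=0.5,
                                 c1_combination=-0.719, c2_combination=-0.095)
print(f"tree-level asymmetry = {asymmetry * 1e6:.1f} ppm")
```

The result is of order $10^{-4}$ and negative, and the $2C_{2u} - C_{2d}$ term contributes only a small fraction of it, which is why a precise determination of that combination is difficult.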

The new measurement reported here was performed using the electron beam at the Thomas
Jefferson National Accelerator Facility (referred to here as Jefferson Lab), in Virginia,
USA. A 100-μA, nearly 90%-longitudinally-polarized electron beam was incident on a
20-cm-long liquid deuterium target held at a temperature of 22 K. Scattered particles were
detected in a pair of magnetic spectrometers that determined the momentum and the direction
of the detected particles to high precision. To directly access $C_{2u,2d}$ the kinematics
were chosen so that the bulk of the detected electrons emerged from the target after
undergoing a DIS interaction. In contrast, all other PVES experiments after SLAC E122 were
performed outside the DIS regime, and thus could not provide clean information on $C_{2q}$.

The size of the asymmetry expected for this measurement is at the level of $10^{-4}$. The
major challenge comes from the combination of the high electron event rate and the high pion
background typical of DIS measurements. This was overcome by the use of a custom electronic
and data acquisition (DAQ) system with built-in pion rejection capability. The DAQ system
successfully counted electrons, event-by-event, at rates up to 600 kHz. The relative
uncertainty in the measured asymmetries due to pion background was less than
$5 \times 10^{-4}$, and that due to counting deadtime was less than 0.4%. The leading
systematic uncertainty comes from the normalization by the electron beam polarization, which
had a relative uncertainty of (1.2-1.8)%. Beam instability was not a significant issue
because of recent advances in the monitoring and feedback control of the beam, a direct
outcome of some of the earlier PVES studies.

*Lists of participants and their affiliations appear at the end of the paper.

The high intensity of the Jefferson Lab beam allowed the completion of the experiment in
just under two months. A total of about 170,000 million scattered electrons were counted at
two DIS settings. The asymmetry measured at E = 6.067 GeV, ⟨x⟩ = 0.241, $Y_1$ = 1.0,
$Y_3$ = 0.44 and ⟨$Q^2$⟩ = 1.085 (GeV $c^{-1}$)$^2$ was

$$
A_{\rm exp} = [-91.1 \pm 3.1({\rm stat.}) \pm 3.0({\rm syst.})] \times 10^{-6} \tag{4}
$$



where ⟨x⟩ and ⟨$Q^2$⟩ are averaged over the spectrometer acceptance, and stat. and syst.
indicate statistical and systematic errors, respectively. This result is to be compared with
the standard model (SM) expectation of $A_{\rm SM} = -87.7 \times 10^{-6}$, with an
uncertainty of $0.7 \times 10^{-6}$ dominated by the uncertainty in the parton distribution
functions (PDFs), parameterizations of how partons (quarks and gluons) that form the nucleon
carry the nucleon’s energy. To allow an extraction of $C_{1u,1d}$ and $C_{2u,2d}$ it is
necessary to express the asymmetry in terms of these couplings. This relation was calculated
using the MSTW2008 leading-order PDF parametrization. For the kinematics above, it gives
$A_{\rm SM} = (1.156 \times 10^{-4})[(2C_{1u} - C_{1d}) + 0.348(2C_{2u} - C_{2d})]$, where
the relative uncertainties of the coefficients for the $(2C_{1u} - C_{1d})$ and the
$(2C_{2u} - C_{2d})$ terms are 0.5% and 5%, respectively. The second DIS setting was at
E = 6.067 GeV, ⟨x⟩ = 0.295, $Y_1$ = 1.0, $Y_3$ = 0.69, ⟨$Q^2$⟩ = 1.901 (GeV $c^{-1}$)$^2$,
and the result was:

$$
A_{\rm exp} = [-160.8 \pm 6.4({\rm stat.}) \pm 3.1({\rm syst.})] \times 10^{-6} \tag{5}
$$



The standard model expectation is $A_{\rm SM} = (-158.9 \pm 1.0) \times 10^{-6}$. The
coupling sensitivity is
$A_{\rm SM} = (2.022 \times 10^{-4})[(2C_{1u} - C_{1d}) + 0.594(2C_{2u} - C_{2d})]$, with
the same relative uncertainties as the first DIS setting. Details of the standard model
calculation and the uncertainty due to PDF fits are given in Supplementary Methods.
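
The comparison between the two measured asymmetries and their expectations can be made explicit with a few lines of arithmetic. In this sketch the sensitivity coefficients, measured values and uncertainties are those quoted in the text, while the coupling combinations −0.719 and −0.095 are the tree-level standard-model values mentioned earlier. Because the quoted $A_{\rm SM}$ values come from the full calculation, the simple evaluation here is expected to agree with them only approximately. Combining all uncertainties in quadrature is an assumption of the example.

```python
import math

def sensitivity_asymmetry(coefficient, weight, c1_combination, c2_combination):
    """A = coefficient x [(2C1u - C1d) + weight x (2C2u - C2d)]."""
    return coefficient * (c1_combination + weight * c2_combination)

def pull(measured, stat, syst, expected, expected_error):
    """(measured - expected) in units of the combined uncertainty."""
    total = math.sqrt(stat ** 2 + syst ** 2 + expected_error ** 2)
    return (measured - expected) / total

C1_SM, C2_SM = -0.719, -0.095

approx_1 = sensitivity_asymmetry(1.156e-4, 0.348, C1_SM, C2_SM)
approx_2 = sensitivity_asymmetry(2.022e-4, 0.594, C1_SM, C2_SM)

# All asymmetries below are in units of 1e-6 (parts per million).
pull_1 = pull(-91.1, 3.1, 3.0, -87.7, 0.7)
pull_2 = pull(-160.8, 6.4, 3.1, -158.9, 1.0)

print(f"setting 1: approx. A_SM = {approx_1 * 1e6:.1f} ppm, pull = {pull_1:.2f}")
print(f"setting 2: approx. A_SM = {approx_2 * 1e6:.1f} ppm, pull = {pull_2:.2f}")
```

Both measurements lie within one combined standard deviation of the standard-model expectation.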

Using the most recent world data for the couplings $C_{1u,1d}$ (ref. 16), obtained from PVES
and caesium atomic parity violation experiments, a simultaneous fit of $2C_{1u} - C_{1d}$
and $2C_{2u} - C_{2d}$ to our results and to the asymmetries from SLAC E122 was performed,
yielding:

$$
\begin{aligned}
(2C_{2u} - C_{2d})\big|_{Q^2=0} &= -0.145 \pm 0.066({\rm exp.}) \pm 0.011({\rm PDF}) \pm 0.012({\rm HT}) \\
 &= -0.145 \pm 0.068({\rm total})
\end{aligned} \tag{6}
$$


Here, exp. refers to the total experimental uncertainty, given by the statistical and the
systematic uncertainties of the asymmetry results added in quadrature. The third uncertainty
is due to the so-called higher-twist (HT) effects, caused by interactions among quarks
inside the target. Further theoretical uncertainties, including QED vacuum polarization and
the γZ box diagram, are negligible compared to the uncertainty due to the PDF fits.
Electroweak and process-specific radiative corrections have been applied to calculate the
values at zero $Q^2$, $C_{2u,2d}|_{Q^2=0}$ (called $g_{VA}^{eu,ed}$, with e referring to
electrons, and similarly $C_{1u,1d}|_{Q^2=0}$ called $g_{AV}^{eu,ed}$, in ref. 21), so that
the values in equation (6) can be compared directly to results from other precision
experiments using different kinds of processes. The values for $C_{2u,2d}|_{Q^2=0}$ differ
from those at both $Q^2$ values accessed in this experiment by 0.002-0.003 for both the up
and the down quarks.
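
The total uncertainty in equation (6), and the statement that the coupling combination is non-zero at better than 95 per cent confidence, can be cross-checked as follows. The sketch assumes the three uncertainties are independent and that the total error is Gaussian; both are assumptions of this example.

```python
import math

def quadrature_sum(*uncertainties):
    """Combine independent uncertainties in quadrature."""
    return math.sqrt(sum(u * u for u in uncertainties))

central_value = -0.145
total_uncertainty = quadrature_sum(0.066, 0.011, 0.012)  # exp., PDF, HT

# Two-sided Gaussian confidence that the value differs from zero.
n_sigma = abs(central_value) / total_uncertainty
confidence = math.erf(n_sigma / math.sqrt(2.0))

print(f"total = {total_uncertainty:.3f}, {n_sigma:.2f} sigma, confidence = {confidence:.3f}")
```

The quadrature sum reproduces the quoted total of 0.068, and the central value sits a little over two standard deviations from zero.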

The asymmetry results in equations (4) and (5) can also be interpreted as a determination of
the weak mixing angle $\theta_W$, an important ingredient of the electroweak unification of
the standard model. The result, evolved to the mass of the Z boson in the modified minimal
subtraction ($\overline{\rm MS}$) scheme, is
$\sin^2\theta_W(Q^2 = M_Z^2)_{\overline{\rm MS}} = 0.2299 \pm 0.0043$, in agreement with the
latest standard-model fit to world data, $\hat{s}^2_Z = 0.23126 \pm 0.00005$.

The result in equation (6) is compared with the standard-model prediction
$(2C_{2u} - C_{2d})|_{Q^2=0} = -0.0950 \pm 0.0004$ in Fig. 1. Our results have greatly
improved the uncertainty on the effective electron-quark VA weak couplings $C_{2u,2d}$ and
are in good agreement with the standard-model prediction. This is also the first direct
measurement of the coupling combination $2C_{2u} - C_{2d}$ that deviates from zero. We note
that evidence for non-zero values of the $C_{2u,2d}$, possibly in a different combination
from what we measured, may have been observed in experiments measuring the nucleon axial
form factors. However, extraction of $C_{2u,2d}$ from the nucleon axial form factor is
model-dependent, whereas in DIS the electron probes quarks unambiguously. The directness of
our approach is essential to reach a significantly higher accuracy in the future, such as
through the PVDIS programme planned for the 12 GeV upgrade of Jefferson Lab.

A comparison of the present result with the standard-model predictions can be used to set
mass limits Λ below which new interactions are unlikely to occur. For the cases of electron
and quark compositeness and contact interactions, we used the convention of ref. 23 and the
procedure in ref. 24. The limit for the constructive (destructive) interference contribution
to the standard model is:


$$
\Lambda^{\pm} = v \left[ \frac{8\sqrt{5}\,\pi}
{\left| \Delta(2C_{2u} - C_{2d})^{\pm}_{Q^2=0} \right|} \right]^{1/2} \tag{7}
$$

where $|\Delta(2C_{2u} - C_{2d})^{\pm}_{Q^2=0}|$ is the difference between the
standard-model value and the upper (lower) confidence bound of the data,
$v = (\sqrt{2}\,G_F)^{-1/2} = 246.22$ GeV is the Higgs vacuum expectation value setting the
electroweak scale, and the $\sqrt{5}$ is a normalization factor taking into account the
coefficients of the $C_{2u,2d}$ in the denominator.
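
The electroweak scale quoted above follows directly from the Fermi constant. The check below assumes the relation $v = (\sqrt{2}\,G_F)^{-1/2}$ and the commonly tabulated value of $G_F$.

```python
import math

G_F = 1.1663787e-5  # Fermi constant in GeV^-2

def higgs_vacuum_expectation_value(fermi_constant=G_F):
    """v = (sqrt(2) G_F)^(-1/2), in GeV."""
    return (math.sqrt(2.0) * fermi_constant) ** -0.5

v_gev = higgs_vacuum_expectation_value()
print(f"v = {v_gev:.2f} GeV")
```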
Figure 1 | Comparison of the present results with those of earlier experiments and
predictions of the standard model. Values of $(2C_{1u} - C_{1d})|_{Q^2=0}$ and
$(2C_{2u} - C_{2d})|_{Q^2=0}$ from this experiment (ellipse with blue horizontal hatching)
are compared with those of SLAC E122 (yellow ellipse). The latest data on $C_{1q}$ (from
PVES and atomic Cs) are shown as the band with magenta vertical hatching. The ellipse with
diagonal green hatching shows the combined result of SLAC E122 and the latest $C_{1q}$,
while the ellipse with red cross-hatching shows the combined result of SLAC E122, this
experiment, and the latest $C_{1q}$. The standard model value (with negligible uncertainty)
is shown as the black dot, where the size of the dot is for visibility.
Figure 2 | Mass exclusion limits Λ on the electron and quark compositeness and contact
interactions. These limits are obtained from the zero-$Q^2$ values of $2C_{1u} - C_{1d}$ and
$2C_{2u} - C_{2d}$ at the 95% confidence level. The outside of the yellow shape shows the
limit obtained from SLAC E122 asymmetry results combined with the best $C_{1q}$ values. The
outside of the red shape shows the limit with our new results added. For visual guidance,
mass limit scales in TeV are shown as solid and dashed circles.


For a 95% confidence level, we extracted

$$
\Lambda^{+} = 5.8\ {\rm TeV} \quad {\rm and} \quad \Lambda^{-} = 4.6\ {\rm TeV} \tag{8}
$$

for constructive and destructive interference from beyond-the-standard-model physics.
Figure 2 illustrates these limits. The limits set by $C_{1u,1d}$ are determined mostly by
previous PVES and caesium atomic-parity-violation results, but this experiment clearly
improves the limits set by $C_{2u,2d}$.

The strength of our results reported here is that they isolate a well-defined
combination of the electron-quark contact interactions. We
note that mass limits on the electron-quark contact interactions have
been published by the ZEUS and H1 collaborations at the Hadron-Electron
Ring Accelerator, HERA. They find $\Lambda^{+} = 3.3$ TeV and
$\Lambda^{-} = 3.2$ TeV (ref. 25), and $\Lambda^{+} = 3.8$ TeV and $\Lambda^{-} = 3.6$ TeV (ref. 26),
respectively, on the electron-quark VA term. Similar limits of
$\Lambda^{+} = 9.5$ TeV and $\Lambda^{-} = 12.1$ TeV have been published by the ATLAS
collaboration at the LHC in the left-left isoscalar model. To account
for the different chirality structure of the models used, the HERA limits
on the electron-quark VA model need to be scaled by $2^{-1/4} = 0.84$,
while the LHC limits using the left-left isoscalar model need to be scaled
by $2^{1/4} = 1.19$, in order to be compared to the mass limits extracted
from $C_{2u,2d}$. The HERA and the LHC measurements are sensitive to
several different vector and axial-vector weak charge combinations,
so their limits were obtained with the assumption that, apart from
the particular chirality combination used in the model, all other contact
interactions are zero. This assumption is unnecessary for the extraction
of mass limits from our results. The chiral structure of the effective
electron-quark weak couplings $C_{2q}$ isolates interactions beyond the
standard model in which it is the chirality of the quarks that is responsible
for the observed parity violation.
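
As a rough illustration of the rescaling described above, the following Python sketch applies the two chirality factors to the H1 and ATLAS limits quoted in the text. The function name and dictionary layout are assumptions made for this example; only the limits and the factors $2^{-1/4}$ and $2^{1/4}$ come from the text.

```python
# Chirality scale factors quoted in the text.
HERA_SCALE = 2 ** -0.25   # about 0.84, for the electron-quark VA model
LHC_SCALE = 2 ** 0.25     # about 1.19, for the left-left isoscalar model


def rescale(limit_tev, factor):
    """Return a published mass limit (TeV) multiplied by a chirality factor."""
    return limit_tev * factor


# Published limits in TeV, keyed by the sign of the interference term.
h1 = {"plus": 3.8, "minus": 3.6}        # H1, VA term (ref. 26)
atlas = {"plus": 9.5, "minus": 12.1}    # ATLAS, left-left isoscalar model

h1_scaled = {k: rescale(v, HERA_SCALE) for k, v in h1.items()}
atlas_scaled = {k: rescale(v, LHC_SCALE) for k, v in atlas.items()}

for name, limits in (("H1", h1_scaled), ("ATLAS", atlas_scaled)):
    print(name, {k: round(v, 2) for k, v in limits.items()})
```

The rescaled numbers are only meant to show how the factors are applied; whether a given limit is then directly comparable still depends on the model assumptions discussed in the paragraph above.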


METHODS SUMMARY 


The parity-violating asymmetry A,,, between right- and left-handed electrons was 
computed from the detected counts C, normalized by the beam intensity I, and 
integrated over periods of stable beam helicity. Two kinds of corrections were then
made to the asymmetries: overall normalization factors and possible systematic 
shifts due to false asymmetries arising from backgrounds or helicity correlations in 
the beam parameters. The normalization factors include the beam polarization, 
measurements of scattered-electron kinematics, electromagnetic radiative correc- 
tions, and effects from two-photon exchange between the electron and target. The 
false-asymmetry corrections were all very small compared to the statistical error and 
included an evaluation of helicity correlations in beam current, position and energy, 
and backgrounds such as pions, scattering from the target aluminium windows, or 
rescattering inside the spectrometers. A summary of all corrections and the asym- 
metry results is presented in Supplementary Table 1. 

To calculate the standard-model expectation of the measured asymmetry and
its sensitivity to $2C_{1u} - C_{1d}$ and $2C_{2u} - C_{2d}$, we used PDFs to calculate the structure
functions in $a_{1,3}$. Three PDF fits were used. Results of the calculation are
shown in Supplementary Table 2. The relative variation among all three fits is less
than 0.6% for the $a_1$ term, and less than 5% for the $a_3$ term of the asymmetry. Effects
from interactions among quarks inside the target, called ‘higher-twist effects’, were
evaluated using the most recent theoretical bounds combined with data on neutrino
structure functions. It was found that the uncertainty in the extraction of $2C_{2u} - C_{2d}$
due to the higher-twist effects is at the same level as that due to the PDFs, and is
quite small compared to the experimental uncertainties.
Progress in atomic, optical and quantum science has led to rapid
improvements in atomic clocks. At the same time, atomic clock research
has helped to advance the frontiers of science, affecting both fundamental
and applied research. The ability to control quantum states
of individual atoms and photons is central to quantum information
science and precision measurement, and optical clocks based on single
ions have achieved the lowest systematic uncertainty of any frequency
standard. Although many-atom lattice clocks have shown advantages
in measurement precision over trapped-ion clocks, their
accuracy has remained 16 times worse. Here we demonstrate a
many-atom system that achieves an accuracy of $6.4 \times 10^{-18}$, which
is not only better than a single-ion-based clock, but also reduces the
required measurement time by two orders of magnitude. By systematically
evaluating all known sources of uncertainty, including
in situ monitoring of the blackbody radiation environment, we
improve the accuracy of optical lattice clocks by a factor of 22. This
single clock has simultaneously achieved the best known performance
in the key characteristics necessary for consideration as a primary
standard: stability and accuracy. More stable and accurate atomic
clocks will benefit a wide range of fields, such as the realization and
distribution of SI units, the search for time variation of fundamental
constants, clock-based geodesy and other precision tests
of the fundamental laws of nature. This work also connects to the
development of quantum sensors and many-body quantum state
engineering (such as spin squeezing) to advance measurement precision
beyond the standard quantum limit.

Accuracy for the SI (International System of Units) second is currently
defined by the caesium (Cs) primary standard. However, optical
atomic clocks have now achieved a lower systematic uncertainty.
This systematic uncertainty will become accuracy once the SI second
has been redefined. Neutral atom clocks with many ultracold atoms
confined in magic-wavelength optical lattices have the potential for
much greater precision than ion clocks. This potential has been
realized only very recently owing to the improved frequency stability of
optical local oscillators, resulting in a record single-clock instability
of $3.1 \times 10^{-16}/\sqrt{\tau}$, where $\tau$ is the averaging time in seconds.
This result represents a gain by a factor of 10 in our clock stability, allowing
for a factor-of-100 reduction in the averaging time that is required
to reach a desired uncertainty. Equivalent instability at one second has
also been recently achieved with ytterbium (Yb) optical lattice clocks,
and averaging for seven hours was demonstrated, down to about $2 \times 10^{-18}$
for a single clock. We used this measurement precision to evaluate the
important systematic effects that have limited optical lattice clocks,
and we achieve a total systematic uncertainty in fractional frequency
of $6.4 \times 10^{-18}$, which is a factor-of-22 improvement over the best
published total uncertainties for optical lattice clocks.
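
The link between stability and averaging time can be made concrete with a short calculation. The sketch below assumes pure white frequency noise, so that the instability falls as $1/\sqrt{\tau}$, and it ignores dead time; the function name is hypothetical.

```python
def averaging_time(instability_1s, target):
    """Averaging time in seconds for an instability of the form
    instability_1s / sqrt(tau) to reach the fractional level `target`."""
    return (instability_1s / target) ** 2


# Time for a 3.1e-16 / sqrt(tau) clock to average down to 6.4e-18.
tau = averaging_time(3.1e-16, 6.4e-18)

# A clock that is ten times less stable needs one hundred times longer.
ratio = averaging_time(3.1e-15, 6.4e-18) / tau

print(f"tau = {tau:.0f} s, ratio = {ratio:.0f}")
```

Under these assumptions the statistical uncertainty reaches the quoted systematic uncertainty after a few thousand seconds, and the quadratic dependence is what turns a factor of 10 in stability into a factor of 100 in averaging time.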

Now that the clock systematic uncertainty has been fully evaluated,
it is a frequency standard at which the statistical uncertainty matches
the total systematic uncertainty within 3,000 s. Combining improved
clock designs with this measurement precision has allowed us to overcome
two main obstacles to achieve the reductions in uncertainty reported
here. First, we must understand and overcome the atomic-interaction-induced
frequency shifts inherent in many-particle clocks. We have
now determined this effect with $6 \times 10^{-19}$ uncertainty. Second, we need
to measure the thermal radiation environment of the lattice-trapped
atoms accurately, because this causes the largest systematic clock shift,
known as the blackbody radiation (BBR) Stark shift. Incomplete knowledge
of the thermal radiation impinging upon the atoms has so far dominated
lattice clock uncertainty. We demonstrate that a combination
of accurate in situ temperature probes and a thermal enclosure surrounding
the clock vacuum chamber allows us to achieve an overall
BBR shift uncertainty of $4.1 \times 10^{-18}$. This progress was enabled by a
precise measurement (performed at the Physikalisch-Technische
Bundesanstalt) of the Sr polarizability, which governs the magnitude
of the BBR shift. Furthermore, we compared two independent Sr clocks
and they agree within their combined total uncertainty of $5.4 \times 10^{-17}$
over a period of one month.

To demonstrate the improved performance of lattice clocks, we built
two Sr clocks in JILA (see the Methods Summary for details). Herein
we refer to the first-generation JILA Sr clock as SrI and the newly constructed
Sr clock as SrII. The recent improvement of low-thermal-noise
optical oscillators allowed us to demonstrate the stability of both
Sr clocks, reaching within a factor of 2 of the quantum projection noise
limit for 2,000 atoms. We constructed the SrII clock with the goal of
reducing the atomic-interaction-related and BBR-related frequency
uncertainties. Thus, SrII has an optical trap volume about 100 times
larger than that of SrI to reduce the atomic density, along with in situ
BBR probes in vacuum to measure the thermal environment of the
atoms, achieving a total systematic uncertainty of $6.4 \times 10^{-18}$. The
improvement of SrI, on the other hand, has been a modest factor of
2 over our previous result, now achieving a total systematic uncertainty
of $5.3 \times 10^{-17}$.

A major practical concern is the speed with which these clocks reach
agreement at their stated uncertainties. Hence, the low instability of these
Sr clocks ($3 \times 10^{-18}$ at about 10,000 s), displayed as the Allan deviation
of their frequency comparison in Fig. 1a, is critical for evaluating systematic
effects in a robust manner. Figure 1b documents a comparison
of the SrI and SrII clocks over a period of one month, showing that their
measured disagreement of $\nu_{\mathrm{SrII}} - \nu_{\mathrm{SrI}} = -2.8 \times 10^{-17}$, with $2 \times 10^{-18}$
statistical uncertainty, is within their combined systematic uncertainty
of $5.4 \times 10^{-17}$. The Allan deviation and the binned intercomparison
data showcase the stability and reproducibility of these clocks on both
short and long timescales. This performance level is necessary for a
rigorous evaluation of clock systematics at the $10^{-18}$ level.
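
For readers unfamiliar with the Allan deviation, the following minimal sketch shows how it could be estimated from a record of fractional-frequency differences between two clocks. It uses the simple non-overlapping estimator, which is an assumption of this example (the text does not state which estimator was used), and the function names are hypothetical.

```python
import math


def allan_deviation(y, m):
    """Non-overlapping Allan deviation of fractional-frequency samples `y`
    for an averaging factor `m` (tau = m times the sample interval)."""
    n_bins = len(y) // m
    if n_bins < 2:
        raise ValueError("need at least two bins of length m")
    # Average the record in consecutive bins of m samples each.
    means = [sum(y[i * m:(i + 1) * m]) / m for i in range(n_bins)]
    # Half the mean squared difference of adjacent bin averages.
    sq_diffs = [(b - a) ** 2 for a, b in zip(means, means[1:])]
    return math.sqrt(sum(sq_diffs) / (2 * (n_bins - 1)))


def single_clock(adev_comparison):
    """Scale a two-clock comparison to one clock by dividing by sqrt(2),
    assuming the two clocks are independent and equally stable."""
    return adev_comparison / math.sqrt(2)


# Toy record that alternates between +1 and -1.
toy = [1.0, -1.0, 1.0, -1.0]
adev_1 = allan_deviation(toy, 1)
adev_2 = allan_deviation(toy, 2)
```

With the toy record, averaging over pairs of samples removes the alternation entirely, which is the behaviour one expects when the averaging time matches the period of a disturbance.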

SrI and SrII independently correct for systematic offsets to their measured
atomic frequencies. Table 1 lists the major sources of frequency
shifts $\Delta$ and their related uncertainties $\sigma$ that affect both clocks. The SrI
clock uncertainty is dominated by its BBR shift uncertainty of $4.5 \times 10^{-17}$.
Figure 1 | Clock comparisons between SrI and SrII. a, Allan deviation of the
SrI and SrII comparison divided by $\sqrt{2}$ to reflect the performance of a single
clock. The red solid line is the calculated quantum projection noise for this
comparison. The green dashed line is a fit to the data, showing the worst case
scenario for the averaging of a single clock of $3.4 \times 10^{-16}$ at one second. The
vertical blue lines represent the $1\sigma$ standard errors for the Allan deviation.
b, The absolute agreement between SrI and SrII recorded at the indicated
Coordinated Universal Time. The light-green region denotes the $1\sigma$ combined
systematic uncertainty for the two clocks under the running conditions at that
time. The top panel shows the frequency record binned at 60 s; in the bottom
panel each solid circle represents 30 min of averaged data. In the bottom panel,
small solid black lines represent the $1\sigma$ standard errors inflated by the square
root of the reduced chi-squared, $\sqrt{\chi^2_{\mathrm{reduced}}}$. For clarity, we have omitted the
error bars in the top panel. The green dashed lines represent the $1\sigma$ standard
error inflated by the square root of the reduced chi-squared for the weighted
mean of these binned comparison data. The final comparison over 52,000 s of
data showed agreement at $-2.7(5) \times 10^{-17}$ ($\sqrt{\chi^2_{\mathrm{reduced}}} = 10.5$) for the 30-min
averaging time and $-2.8(2) \times 10^{-17}$ ($\sqrt{\chi^2_{\mathrm{reduced}}} = 3.5$) for the 60-s averaging
time (see Methods).


For SrII, on the other hand, all sources have been evaluated to produce
uncertainties better than $4 \times 10^{-18}$.
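
The total uncertainties in Table 1 can be cross-checked by adding the individual contributions in quadrature. The sketch below uses the uncertainties as we read them from Table 1, in units of $10^{-18}$, and assumes independent contributions; entries quoted as "<0.1" are taken as 0.1, which is an assumption of this example, and the variable names are hypothetical.

```python
import math

# Uncertainties in units of 1e-18, in the row order of Table 1.
# Entries quoted as "<0.1" are entered as 0.1 (assumed upper bound).
SIGMA_SR1 = [45, 6, 12, 11, 4, 0.1, 1, 0.1, 0.1, 4, 0.07, 20, 0.1, 4]
SIGMA_SR2 = [1.8, 3.7, 0.6, 3.7, 1.3, 1.1, 1.2, 0.1, 0.1, 2.1, 0.6, 0.4, 0.1, 0.6]


def quadrature_sum(values):
    """Square root of the sum of squares of independent uncertainties."""
    return math.sqrt(sum(v * v for v in values))


total_sr1 = quadrature_sum(SIGMA_SR1)
total_sr2 = quadrature_sum(SIGMA_SR2)

# Static and dynamic BBR terms of SrII combined into one BBR uncertainty.
bbr_sr2 = quadrature_sum([1.8, 3.7])

print(round(total_sr1), round(total_sr2, 1), round(bbr_sr2, 1))
```

With these inputs the sums round to the quoted totals of 53 and 6.4, and the two SrII BBR terms combine to the overall BBR uncertainty of 4.1 quoted in the text.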

The largest improvement (compared to other lattice clocks) in the 
total systematic uncertainty of SrII was obtained through control of the 
BBR shift. We enclosed the entire clock apparatus inside a BBR shield- 
ing box (Fig. 2a). Our lasers for cooling, trapping and clock spectroscopy 
are delivered to the inside of the BBR shielding box by optical fibres, 
preventing stray radiation from entering. We have also installed two 
Table 1 | Frequency shifts and related uncertainties for SrI and SrII

Sources for shift               Δ_SrI     σ_SrI    Δ_SrII      σ_SrII
BBR static                      -4,832    45       -4,962.9    1.8
BBR dynamic                     -332      6        -345.7      3.7
Density shift                   -84       12       -4.7        0.6
Lattice Stark                   -279      11       -461.5      3.7
Probe beam a.c. Stark           8         4        0.8         1.3
First-order Zeeman              0         <0.1     0.2         1.1
Second-order Zeeman             -175      1        -144.5      1.2
Residual lattice vector shift   0         <0.1     0           <0.1
Line pulling and tunnelling     0         <0.1     0           <0.1
d.c. Stark                      -4        4        -3.5        2.1
Background gas collisions       0         0.07     0           0.6
AOM phase chirp                 -7        20       0.6         0.4
Second-order Doppler            0         <0.1     0           <0.1
Servo error                     1         4        0.4         0.6
Totals                          -5,704    53       -5,921.2    6.4

Shifts and uncertainties are given in fractional frequency, in units of $10^{-18}$. Uncertainties are
quoted as $1\sigma$ standard errors. They are determined with the square root of the quadrature sum of the
systematic error and statistical error, with the latter quantity inflated by $\sqrt{\chi^2_{\mathrm{reduced}}}$. For SrI, the significant
digit for each uncertainty ends at the $1 \times 10^{-18}$ level; for SrII, the significant digit is extended to the
$1 \times 10^{-19}$ level. See the text and Methods for a detailed discussion of all these systematic uncertainties,
including the hyperpolarizability effect of the lattice Stark shift.


in situ silicon diode temperature sensors (with calibrations traceable to 
the National Institute of Standards and Technology (NIST)) near the 
atoms to measure their radiative heat environment (Fig. 2a). The sen- 
sors were affixed to separate glass tubes (Fig. 2a), which prevented 
parasitic heat conduction from the chamber to the sensors by provid- 
ing insulation and radiative dissipation of conductive heat. To improve 
the radiative coupling, the surfaces of the sensors were coated with high 
absorptivity, ultrahigh-vacuum-compatible paint. One sensor was 
mounted 2.54 cm away from the atoms and provided real-time temperature
monitoring during clock operation. The second sensor was
affixed to an in-vacuum translator, allowing us to map the temperature 
gradients near the lattice-confined atoms (inset to Fig. 2c). During clock 
operations the mechanical translator was retracted to avoid interfer- 
ence with atoms (Fig. 2b). Systematic errors in both the readout of the 
sensors and their ability to determine the actual thermal distribution at 
the position of the atoms resulted in an overall uncertainty of 26.7 mK 
for the stated BBR temperature. Table 2 lists the sources of uncertain- 
ties for this temperature evaluation. 

The atoms are influenced not only by the total integrated power of 
the BBR inside the chamber, known as the BBR static correction, but 
also the frequency-weighted spectrum of the radiation inside the chamber, 
known as the BBR dynamic correction. We constructed a ray-tracing 
model of our chamber to estimate the influence of temperature gradients
throughout the vacuum chamber. The model predicts the error
incurred in the BBR dynamic correction by calculating the deviation
from a perfect BBR spectrum for the temperature read out by the sensor
(see Methods). A perfect BBR environment corresponds to a spatially
uniform temperature; deviations from such an environment cause a
temperature gradient inside the chamber. As shown in Fig. 2c, components
that couple strongly to the sensor, such as the large viewports,
would need to deviate in temperature from the rest of the chamber by a
significant amount (more than 10 K) for this error to reach the $1 \times 10^{-18}$
level. Furthermore, within our BBR shielding box with small temperature
gradients, the dominant emissivity-weighted solid angle of the
vacuum viewports made the model’s predictions for the dynamic BBR 
correction insensitive to the exact emissivity values. We ensured that 
the BBR shielding box is fully sealed from the outside environment, 
‘forbidding’ the atoms to ‘view’ any highly emissive object with a tem- 
perature differing from the inside ambient temperature. This BBR shield- 
ing box also allowed the clock vacuum chamber to be insulated from 
room-temperature variations, because it reaches an equilibrium tem- 
perature of 301 K after two hours of clock operation. 

Most systematics listed in Table 1 are rapidly measured through self-comparison
with digital lock-in (see Methods). Both SrI and SrII measure
their systematics by modulating a particular physical parameter
Figure 2 | Characterizing BBR effects on the $^{1}S_{0}$-$^{3}P_{0}$ transition.
a, A three-dimensional model of the clock vacuum chamber. The sensor
mounted on an in-vacuum translator is depicted in its fully extended mode of
operation. The entire clock chamber resides inside a BBR shielding box with an
equilibrium temperature of 301 K. b, A photograph of the two glass tubes
surrounding the trapped $^{87}$Sr atoms (red arrow). The movable sensor
(green arrow) has been retracted for its normal operation. c, The error inherent
in assuming a perfect BBR spectrum inside the vacuum chamber, based on a
measurement of total BBR radiated power. Modelling all components of the


every two experimental cycles, with the clock laser serving as a stable 
reference with which to measure the related frequency shifts. For example, 
the atomic density shift was measured to high precision with this method. 
The SrII system is designed to reach a density shift uncertainty below
$1 \times 10^{-18}$ by using large lattice trapping volumes. To accommodate
this, we used a Fabry-Perot buildup cavity to achieve a sufficiently deep 
lattice. This trap design increases the number of atoms loaded into the 
lattice at a decreased atomic density, allowing SrII to measure an already- 
reduced density shift to very high precision. Details of the SrI and SrII 
optical lattice trap geometries can be found in ref. 6 and in Methods. 
Frequency shifts induced by the optical lattice potential must be under- 
stood and controlled at an extremely high level of precision, especially 
for optical lattices that trap weakly against gravity. We used a variety
of methods to stabilize the lattice scalar, vector and tensor Stark shifts. 
An 813-nm continuous-wave Ti:sapphire laser was used to create the 
lattice light for SrII. The clean spectrum of the solid-state laser has the 
advantage over a semiconductor tapered-amplifier-based lattice, where 
spontaneous noise pedestals might cause additional frequency shifts.
For SrI, we used a tapered-amplifier system, but we refined the output
spectrum with a narrow-band interference filter and an optical filter 
cavity. To deal with potential residual shifts due to the tapered-amplifier 
noise pedestals, we regularly calibrated the lattice Stark shift for SrI.
Both clocks stabilize their lattice laser frequencies to a Cs clock via a 
self-referenced Yb fibre comb, and their trapping light intensities were 
stabilized after being delivered to the atoms. The lattice vector shift was 
cancelled by alternately interrogating the +9/2 and —9/2 stretched 
nuclear spin states of the atom on successive experimental cycles, in addition
to the use of linearly polarized lattice light. This interrogation


Table 2 | Uncertainties for the in-vacuum silicon diode thermometer

Corrections                            ΔT (mK)   σ_T (mK)
Calibration (including self-heating)   0         16
Residual conduction                    0         0.7
Temperature gradient                   40        20
Lead resistance                        7.7       1.5
Lattice light heating                  -15       7.5
Totals                                 32.7      26.7

All uncertainties are quoted as $1\sigma$ standard errors. The absolute calibration of the silicon diode sensor
(including the self-heating effect) was performed by the vendor (Lake Shore Cryotronics) and the
calibration is traceable to the NIST blackbody radiation standard. We have evaluated corrections and
their uncertainties for the operation of the sensors in our vacuum chamber, including the residual
conduction by the mount, extra lead resistance and lattice light heating.
chamber as 301 K and varying the bottom window temperature (shown in
the top horizontal axis) shows that measuring the total radiative power is 
sufficient for our quoted BBR systematic uncertainty. The bottom horizontal 
axis displays the temperature difference between the atoms and the retracted 
sensor. The inset is a typical measured temperature difference inside the 
vacuum chamber referenced to the temperature of the retracted movable sensor 
at the beginning of the measurement. Green diamond, retracted position; 

red square, atomic position. 


sequence also allowed cancellation of the first-order Zeeman shift. 
Rather than trying to separate the scalar and tensor shifts artificially, 
we treat them as a single effect in our measurement of the a.c. Stark 
shift. (In reference to alternating (or direct) current, a.c. (or d.c.) is
used to denote oscillatory (or static) fields and their effects.) We further 
minimized the tensor shift’s sensitivity to the magnetic bias field B 
by setting the lattice polarization and the direction of B to be parallel. 
When modulating the intensity of the lattice, we did not identify any 
lattice shifts that are nonlinear in lattice intensity. Specifically, we elim- 
inated systematic biases arising from differential atomic interaction shifts 
and optical spectrum shifts from the a.c. Stark effect. A Fisher test per- 
formed for various model shifts on an extensive set of data (shown in 
Fig. 3a) demonstrated that the lattice shift is consistent only with a linear 
model to within $1\sigma$ uncertainty (see Methods).
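
The cancellation obtained by alternating between the two stretched states can be illustrated with a toy calculation. The sketch below assumes that the first-order Zeeman and lattice vector shifts are equal and opposite for the +9/2 and -9/2 states and that everything else is common to both; the function name and the numbers are hypothetical.

```python
def combine_stretched_states(nu_plus, nu_minus):
    """Combine frequencies measured on the +9/2 and -9/2 stretched states.

    Shifts that are linear in the spin projection enter with opposite
    signs, so they cancel in the average and appear only in the
    half-difference. Returns (centre, half_splitting).
    """
    centre = 0.5 * (nu_plus + nu_minus)
    half_splitting = 0.5 * (nu_plus - nu_minus)
    return centre, half_splitting


# Hypothetical example: a common 10.0 Hz offset plus a +/-2.5 Hz linear shift.
centre, half_splitting = combine_stretched_states(10.0 + 2.5, 10.0 - 2.5)
print(centre, half_splitting)
```

The half-difference is not wasted: it tracks the size of the linear shift, and hence the magnetic field, while the average is insensitive to it.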

For SrII we also took extra care to minimize fluctuations in the magnetic- 
field-related lattice Stark effect and the second-order Zeeman shift.
To stabilize the magnetic field for our clock over long operational periods, 
we used the atoms themselves as a collocated magnetometer for the 
clock. Every two minutes during the clock operation, the computer- 
based frequency locking program was paused to interrogate unpolar- 
ized atomic samples under zero applied magnetic field. A drift in the 
background magnetic field resulted in a reduced excitation for the peak 
of an unpolarized line, because all ten nuclear spin states will experi- 
ence different Zeeman shifts. Every time the magnetic field servo was 
activated, the program automatically dithered each pair of magnetic 
field compensation coils (along three orthogonal spatial directions) and 
optimized the current for each pair of coils. As shown in Fig. 3b, the 
magnetometer-based feedback loop not only keeps the field direction 
constant throughout the clock operation, but also automatically nulls 
the background field without an operator’s intervention. For SrI, this procedure
is unnecessary owing to its more stable magnetic field environment.

To push the systematic evaluation to the $10^{-18}$ level, we also needed
to evaluate the d.c. electric-field-induced Stark effect, a frequency shift
mechanism caused by patch charges immobilized in the vacuum chamber’s
fused silica viewports. A pair of disk-shaped electrodes was placed
near the two largest viewports in the system (separated along the ver- 
tical direction) and shifts were recorded as the electrode polarity was 
switched. Differences between the frequency shifts induced by oppo- 
sitely charged electrodes indicate the presence of stray background 
electric fields, as shown in Fig. 3c. On first measuring the d.c. Stark 
Figure 3 | Examples of systematic evaluations. a, To determine the lattice a.c.
Stark effect accurately, a variety of lattice depths were used. This effect is
depicted as a function of the differential lattice depth with binning
chosen for figure clarity (average bin size 68 min, corresponding to an average
of 1,600 points). Within our measurement precision, the best fit is
a linear model. Grey circles denote mean frequency shifts and small solid black
lines represent the $1\sigma$ standard errors inflated by the square root of
the reduced chi-squared. The solid green line is a linear model
and the light green patch represents the $1\sigma$ standard error for this model.
b, Using the atomic cloud as a collocated magnetometer, a residual
non-zero magnetic field is inferred via the peak excitation of
an unpolarized Rabi lineshape. The left panel shows the servo action
of zeroing the residual magnetic field. The right panel shows a clock transition
lineshape for an unstabilized magnetic field (red open circles) and an
improved lineshape under the stabilized magnetic field (blue filled circles). The
red and blue solid lines are simply guides for the eye. c, Measurements
of d.c. electric-field-induced Stark shift show a quadratic behaviour. The red
circles show that a residual shift due to the stray d.c. field was $-1.3 \times 10^{-16}$.
The blue squares show a greatly reduced shift after purging the vacuum
chamber with N$_2$ gas. Dashed lines show a quadratic fit to the data. Solid black
lines represent the $1\sigma$ standard errors inflated by the square root of the reduced
chi-squared. Solid red and blue vertical lines show the locations of zero net
electric field.


74 | NATURE | VOL 506 | 6 FEBRUARY 2014 


effect on SrII, a residual, stable $-1.3 \times 10^{-16}$ shift was discovered.
However, when the vacuum chamber was filled with clean nitrogen
and then re-evacuated, we reduced the measurable d.c. Stark effect to
$-1.6(1.0) \times 10^{-18}$. To complete the full evaluation of the d.c. Stark
effect, we performed similar measurements along the horizontal direction
and determined its effect at $-1.9(1.9) \times 10^{-18}$.
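
The way a polarity reversal exposes a stray field can be sketched with a simple quadratic model. The example below assumes the shift is $-k(V + V_0)^2$, where $V$ is the applied electrode voltage and $V_0$ stands in for the stray field; the model, the function name and the numbers are assumptions made for illustration, not the authors' analysis.

```python
def stray_field_shift(v, shift_plus, shift_minus):
    """Residual d.c. Stark shift at zero applied voltage.

    `shift_plus` and `shift_minus` are the frequency shifts measured at
    +v and -v relative to zero applied voltage, for a shift of the form
    -k * (V + V0) ** 2. Requires a non-zero quadratic response.
    """
    # The sum of the two shifts isolates the quadratic coefficient k.
    k = -(shift_plus + shift_minus) / (2.0 * v ** 2)
    # The difference is linear in the stray offset V0.
    v0 = -(shift_plus - shift_minus) / (4.0 * k * v)
    return -k * v0 ** 2


# Synthetic check with k = 2, V0 = 0.5 and V = 1 (arbitrary units).
def model(applied):
    return -2.0 * (applied + 0.5) ** 2


residual = stray_field_shift(1.0, model(1.0) - model(0.0), model(-1.0) - model(0.0))
print(residual)
```

If the two polarities give identical shifts, the inferred offset and the residual shift are both zero, which is the situation the nitrogen purge described above moves the system towards.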

The outlook for optical lattice clocks is bright. We note that this is 
only the first systematic evaluation of a lattice clock enabled by the new 
generation of stable lasers, which led to clock stability near the quantum 
projection noise limit for 1,000 atoms. As laser stability continues 
to improve, Sr and other lattice clocks will increase their quantum-
projection-noise-limited precision with larger numbers of atoms. Along 
with great advances in stability, the systematic uncertainty for such clocks 
will rapidly decrease owing to much reduced measurement times. Hence, 
the stability and total uncertainty of future lattice clocks will advance 
in lockstep. The techniques demonstrated here will allow for clock 
stability and total uncertainty below $1 \times 10^{-18}$. Such clocks will in turn
push forward a broad range of quantum sensor technologies and facil- 
itate a variety of fundamental physics tests. 


METHODS SUMMARY 


For both clocks, a few thousand $^{87}$Sr atoms are laser cooled to around 3 μK and
trapped in one-dimensional optical lattices near the magic wavelength (813 nm),
with trap depths ranging from $40E_r$ to $300E_r$ (where $E_r$ is the photon recoil energy).
A thermal-noise-limited laser with a short-term stability of $1 \times 10^{-16}$ (from 1 s to
1,000 s) interrogates the $^{1}S_{0}$-$^{3}P_{0}$ clock transition with Rabi spectroscopy for 160 ms.
The clock comparison is normally operated in an asynchronous interrogation
mode, where the two clock probe pulses are purposely non-overlapping in time.
Two independent acousto-optic modulators (AOMs) are used to correct the laser
frequency to the SrI and SrII clock transitions. State detection of atomic ensembles
is a destructive measurement that requires the repetition of the experimental cycle 
every 1.3 s. After each cycle, the frequency corrections, atom numbers and envir- 
onmental temperatures for both systems are recorded and time stamped. For 
evaluation of the systematic uncertainty of the clock frequency, we focus on a 
few dominating effects such as the blackbody radiation, atomic interaction, lattice 
Stark shift and magnetic field, as well as a range of other sources of uncertainties 
such as the d.c. Stark shift, the clock laser a.c. Stark shift, line pulling and lattice 
tunnelling effects, AOM phase chirp, the second-order Doppler effect, back- 
ground gas collisions and atomic servo errors. All experimentally measured quan- 
tities are treated with rigorous statistical analysis. Long time records of data are 
binned into various sizes of time windows, producing means and standard deviations
of these bins along with the square root of the reduced chi-squared, $\sqrt{\chi^2_{\mathrm{reduced}}}$. When
$\sqrt{\chi^2_{\mathrm{reduced}}} > 1$, indicating overscatter in data, the smaller bins’ standard deviations
are scaled up to bring $\sqrt{\chi^2_{\mathrm{reduced}}}$ to 1, and the analysis to determine the systematic
is repeated.
METHODS


Atomic sample preparation. For both clocks, up to a few thousand $^{87}$Sr atoms are
laser cooled to a few microkelvin and trapped in one-dimensional optical lattices
near the magic wavelength (813 nm), with trap depths ranging from $40E_r$ to $300E_r$.
Here $E_r$ is the photon recoil energy. A thermal-noise-limited laser with a short-term
stability of $1 \times 10^{-16}$ (from 1 s to 1,000 s) interrogates the $^{1}S_{0}$-$^{3}P_{0}$ clock transition
with Rabi spectroscopy over a 160-ms probe time. We allow a sufficient time
for transient perturbations in the system to decay before we interrogate the clock 
transition. We normally operate the clock comparison in an asynchronous inter- 
rogation mode, where the two clock probe pulses are purposely non-overlapping 
in time. Two independent frequency shifters (acousto-optic modulators (AOMs))
are used to correct the laser frequency to the SrI and SrII clock transitions. State 
detection of both atomic ensembles is a destructive measurement that requires the 
repetition of the experimental cycle every 1.3s. After each cycle, the frequency 
corrections, atom numbers, and environmental temperatures for both systems are 
recorded and time stamped for comparison and post-processing. 

Statistical methods for data analysis. For all systematic measurements, residual
non-white noise introduces overscatter in the data. Following our previously reported
procedure, the data are first binned into smaller chunks, the means and standard
deviations of these bins are determined, and the square root of the reduced
chi-squared, $\sqrt{\chi^2_{\mathrm{reduced}}}$, is obtained. In instances where
$\sqrt{\chi^2_{\mathrm{reduced}}} > 1$, indicating overscatter in the data, the
smaller bins’ standard deviations are inflated to bring $\sqrt{\chi^2_{\mathrm{reduced}}}$ to 1, and the
analysis to determine the systematic is repeated. This conservative approach is
applied to all measurements in this Letter unless otherwise noted.
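
The binning and inflation procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' analysis code: it uses an inverse-variance weighted mean, and it inflates the error of that mean by $\sqrt{\chi^2_{\mathrm{reduced}}}$, which is equivalent to inflating every bin's standard error by the same factor. All function names are hypothetical, and bins with zero scatter are not handled.

```python
import math


def bin_means(data, bin_size):
    """Split `data` into consecutive bins of `bin_size` samples and return
    a (mean, standard error) pair for each complete bin."""
    out = []
    for i in range(0, len(data) - bin_size + 1, bin_size):
        chunk = data[i:i + bin_size]
        mean = sum(chunk) / bin_size
        var = sum((x - mean) ** 2 for x in chunk) / (bin_size - 1)
        out.append((mean, math.sqrt(var / bin_size)))
    return out


def weighted_mean_inflated(bins):
    """Weighted mean of (value, error) pairs. The returned error is
    multiplied by sqrt(reduced chi-squared) when that factor exceeds 1.
    Returns (mean, error, sqrt_chi2_reduced)."""
    weights = [1.0 / err ** 2 for _, err in bins]
    mean = sum(w * val for w, (val, _) in zip(weights, bins)) / sum(weights)
    err = math.sqrt(1.0 / sum(weights))
    chi2 = sum(w * (val - mean) ** 2 for w, (val, _) in zip(weights, bins))
    sqrt_chi2_red = math.sqrt(chi2 / (len(bins) - 1))
    return mean, err * max(1.0, sqrt_chi2_red), sqrt_chi2_red


# Two bins that disagree by four standard errors: the error gets inflated.
mean, err, scatter = weighted_mean_inflated([(0.0, 1.0), (4.0, 1.0)])
print(mean, err, scatter)
```

When the bins scatter no more than their error bars suggest, the factor stays at 1 and the ordinary weighted-mean error is returned unchanged.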

Previously, the most comprehensive systematic evaluations of optical lattice clocks 
with total systematic uncertainty better than that of Cs were reported in refs 8-10 and 
31. During the production of this manuscript, another systematic evaluation of Sr 
from the Physikalisch-Technische Bundesanstalt group has been released. Below,
we provide a detailed discussion of the systematics we evaluated in this work. 
Blackbody radiation shifts. The blackbody radiation shift is determined by
$\Delta\nu_{\mathrm{BBR}} = -2.13023(T/300)^4 - 0.1484(T/300)^6$, where $T$ is the temperature in K
and $\Delta\nu_{\mathrm{BBR}}$ has units of Hz. The $T^4$ term is known as the static shift, and the $T^6$
term is called the dynamic shift. To ascertain the radiative temperature experienced
by the atoms, silicon diode temperature sensors were installed in SrII.
Silicon diode sensors are used for their ease of calibration (because their forward
voltage drop is linear in temperature) and their suitability for vacuum baking. We
investigate the thermalization of the probes by modelling their heat transfer. For
the small thermal gradients measured around our chamber, the sensors give a good
measurement of the integrated BBR spectrum, which is proportional to the static
BBR shift experienced by the atoms.
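
To give a feel for the magnitudes involved, the sketch below evaluates the two terms of the BBR formula and propagates a temperature uncertainty into the static term. The Sr clock transition frequency used to convert hertz into fractional units is not given in the text and is supplied here as an assumption, as is the choice of 300 K as the evaluation temperature; the function names are hypothetical.

```python
# Sr clock transition frequency in Hz (assumed value, not from the text).
NU_SR = 429_228_004_229_873.0


def bbr_static_hz(temp_k):
    """Static (T^4) BBR shift in Hz."""
    return -2.13023 * (temp_k / 300.0) ** 4


def bbr_dynamic_hz(temp_k):
    """Dynamic (T^6) BBR shift in Hz."""
    return -0.1484 * (temp_k / 300.0) ** 6


def static_shift_error(temp_k, temp_err_k):
    """Fractional uncertainty of the static shift from a temperature
    uncertainty, using d(static)/dT = 4 * static / T."""
    return abs(4.0 * bbr_static_hz(temp_k) / temp_k) * temp_err_k / NU_SR


static_300 = bbr_static_hz(300.0) / NU_SR
dynamic_300 = bbr_dynamic_hz(300.0) / NU_SR
err_300 = static_shift_error(300.0, 0.0267)   # 26.7 mK sensor uncertainty

print(f"{static_300:.4e} {dynamic_300:.3e} {err_300:.2e}")
```

Under these assumptions the static and dynamic terms come out near $-4.963 \times 10^{-15}$ and $-3.46 \times 10^{-16}$, and a 26.7 mK temperature uncertainty corresponds to roughly $1.8 \times 10^{-18}$ in the static term.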

The dynamic shift, which depends on the frequency-weighted spectrum of radi- 
ation experienced by the atoms, is calculated using the temperature read out by the 
sensor. The $T^6$ coefficient associated with this shift was chosen to be a simple mean
between the two most recent publications and the uncertainty in this coefficient
was chosen to be their difference. This coefficient uncertainty is the dominant
BBR uncertainty. To understand the error we accrue by calculating the dynamic 
shift with the sensor reading, a ray-tracing model of our chamber was constructed. 
Refraining from using a Monte Carlo population of rays, we used Hammersley 
boundary points to construct the ray population in a controlled, repeatable and 
processor-efficient manner. Effective emissivity-weighted solid angles are then 
tabulated according to whatever position inside the main chamber the end user 
requires. By keeping track of the approximate blackbody radiation spectrum at any 
point in the chamber and using the relevant Einstein A coefficients of Sr in ref. 22, 
the error associated with assuming a perfect BBR spectrum was determined. The 
model’s results, as seen in Fig. 2c, show that this error is well below our quoted sys- 
tematic uncertainty for BBR under small temperature variations around our chamber. 

Within our BBR shielding box with small temperature gradients, the dominant 
emissivity-weighted solid angle of the vacuum viewports makes the model’s pre- 
dictions for the dynamic BBR correction insensitive to the exact emissivity values. 
For example, for small temperature differentials (3 K) across our chamber, the emis- 
sivity of the metal would need to change by a factor of 20 to introduce a 1 x 10°" 
error into the model’s predictions. 

However, much hotter components that couple more strongly to the atoms, 
such as the heated Zeeman slower window, introduce a larger error in the deviation 
from a perfect BBR spectrum. For the systematic uncertainty evaluation, the oper- 
ation of the Sr optical lattice clock can be performed without heating the Zeeman 
slower window for a limited amount of time. Even with the Zeeman slower window 
heated, its contribution to total uncertainty is below $1.2\times10^{-18}$. For future experiments
that require total uncertainty below $1\times10^{-18}$, we can simply add mechanical
shutters that obscure the atoms’ view of all hot elements in the system. SrI’s
temperature is a simple weighted mean, based on a set of temperature measurements 
made at various points on the chamber. The weights for this mean are derived from 


the emissivity-weighted solid angles the atoms experience from various compo- 
nents. We conservatively quote the full range of temperature across the SrI cham- 
ber (0.7 K) as the SrI temperature uncertainty. 

Atomic density shifts. For both SrI and SrII, the atomic density shift is measured 
via self-comparison by modulating the lattice trapped atom number during sub- 
sequent measurements of the line centre. The fast modulation timescale makes 
these measurements immune to long-term atom number drifts. Extrapolation of 
the density shift from changes in atomic density due to changes in the lattice trap- 
ping potential is used only when trap frequencies have changed by less than 10%. Any 
additional error from this extrapolation is included in the final quoted uncertainty. 
Magnetic field effects. The first-order Zeeman shifts are cancelled by alternately 
interrogating opposite nuclear spin stretched states. After each line centre acquisi- 
tion of the clock laser, the quoted Sr frequency is the running average between the 
two previous measurements. Key to the efficacy of this scheme is that changes in 
the magnetic field over two line centre acquisitions are either negligible or aver- 
aged away. The residual first-order Zeeman shift is calculated by examining the 
overall drift of the magnetic field splitting between the two stretched nuclear spin 
states, and extrapolating the inherent shift this drift would induce. 

The second-order Zeeman shifts, caused by the presence of an applied bias and a
residual magnetic field during the clock interrogation sequence, are subtracted off 
point by point. The determination of the second-order Zeeman shift coefficient is 
made via a fast modulation experiment whereby four digital locks are modulated 
in sequence: two locks at a high magnetic field, one for a measurement of each stretched 
spin component, and two locks at a low magnetic field. This experiment was performed
for stretched states’ splitting ranging from 300 Hz to 1,200 Hz. To fit these data,
a quadratic function, $cS^2$, was used. Here c was found to be $-0.248(2)\times10^{-6}$ Hz$^{-1}$,
and S is the measured stretched state splitting in Hz. Extra care was taken to ensure 
that modulation of the applied bias field did not cause any rotation of the polar- 
ization axis, which would induce an unwanted differential lattice tensor shift. The 
second-order Zeeman shift uncertainty is quoted for a 500-Hz splitting between 
the $m_F = \pm 9/2$ stretched states, which allows for a relatively strong bias magnetic
field (about 50 µT) to be applied to the atoms during clock operation.
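
A short Python sketch of the quadratic model may help to make the size of this correction concrete. The coefficient is taken as c = -0.248(2) x 10^-6 Hz^-1, which is our reading of the value quoted above; the function names and the simple quadrature error propagation are assumptions for illustration.

```python
import math

C_COEFF_PER_HZ = -0.248e-6   # quadratic coefficient c in Hz^-1 (as read from the text)
C_COEFF_ERR_PER_HZ = 0.002e-6  # quoted uncertainty of c


def second_order_zeeman_shift_hz(splitting_hz):
    """Second-order Zeeman shift c * S^2 for a stretched-state splitting S."""
    return C_COEFF_PER_HZ * splitting_hz ** 2


def shift_uncertainty_hz(splitting_hz, splitting_err_hz):
    """Uncertainty of c * S^2 from the errors in c and in S, added in quadrature."""
    from_coeff = splitting_hz ** 2 * C_COEFF_ERR_PER_HZ
    from_splitting = abs(2.0 * C_COEFF_PER_HZ * splitting_hz) * splitting_err_hz
    return math.hypot(from_coeff, from_splitting)
```

Calling `second_order_zeeman_shift_hz(500.0)` evaluates the correction at the 500-Hz operating splitting, and `shift_uncertainty_hz(500.0, 0.1)` shows how an error of 100 mHz in S would enter.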

During the course of the previous measurement, instabilities in the magnetic 
field environment of SrII were discovered. Active control of the magnetic field was
implemented to combat this. As detailed in the main text, the peak excitation of an 
unpolarized line is used as a measure of the residual magnetic field in the system. 
Once the clock has entered into the servo routine, it takes three peak excitation 
measurements for each coil at different currents ($I_0 - \Delta I$, $I_0$, $I_0 + \Delta I$, where $I_0$ is the
current of the previous iteration and $\Delta I$ is a small trial step). In situations where the
residual field is far from zero, the software steps the current by a fixed amount in 
the direction indicated by the increasing excitation. When the field is near zero, a 
parabola was fitted between the three measurements and the current was stepped 
to the fitted value filtered via a low-pass finite impulse response filter. On stopping 
the servo routine and measuring the unpolarized line, the residual magnetic field 
wander was measured to be less than 0.3 pT. 
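
The servo logic described above can be sketched as follows. This is a minimal illustration, not the authors' software: the function names, the criterion used to decide that the field is "near zero" (fitted vertex within one trial step of the present current) and the FIR tap values are all assumptions.

```python
def parabola_vertex(i0, di, y_minus, y_zero, y_plus):
    """Current at the vertex of the parabola through three equally spaced points.

    The points are (i0 - di, y_minus), (i0, y_zero), (i0 + di, y_plus).
    Returns None when the points are not concave, because there is then
    no excitation maximum to aim for.
    """
    curvature = y_minus - 2.0 * y_zero + y_plus
    if curvature >= 0.0:
        return None
    return i0 + di * (y_minus - y_plus) / (2.0 * curvature)


def next_current(i0, di, y_minus, y_zero, y_plus, history,
                 fixed_step, taps=(0.5, 0.3, 0.2)):
    """Return the coil current for the next servo iteration.

    history is a list of recent fitted vertices (newest first) and is
    updated in place; taps are hypothetical low-pass FIR coefficients.
    """
    vertex = parabola_vertex(i0, di, y_minus, y_zero, y_plus)
    if vertex is None or abs(vertex - i0) > di:
        # Far from zero field: fixed step towards increasing excitation.
        direction = 1.0 if y_plus > y_minus else -1.0
        return i0 + direction * fixed_step
    # Near zero field: low-pass filter the fitted vertex positions.
    history.insert(0, vertex)
    del history[len(taps):]
    used = taps[:len(history)]
    return sum(t * v for t, v in zip(used, history)) / sum(used)
```

Normalizing by the sum of the taps actually used keeps the filter gain at unity while the history is still filling up.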

Lattice Stark effects. To achieve the low uncertainties reported for SrII, a variety 
of new techniques were used to minimize any systematic errors in the measurement
of the lattice Stark effects. The intensity servos were implemented with a
liquid-crystal waveplate for SrI and an acousto-optic modulator for SrII. Trap 
frequency measurements were performed at each measured a.c. Stark point via 
a high-resolution sideband scan. As explained in ref. 35, radial motion only brings 
the sidebands closer to the carrier, so the true longitudinal sideband frequency is 
found by looking at the farthest edge of the blue sideband. This edge corresponds 
to the contributions of atoms that are distributed at the centre of the Gaussian 
profile for the lattice beam. A tangent line was fitted to the inflection point of the 
Lorentzian, and then the Lorentzian centre frequency could be extrapolated from 
the tangent x-intercept and the Lorentzian linewidth determined by the Fourier- 
limited linewidth of the clock laser scan. A scan of the carrier was taken simulta- 
neously with these high-resolution sideband scans, and a fit to the carrier was used 
to determine the Lorentzian linewidth. Once the longitudinal trap frequency was 
determined, the trap depth could be extracted using the procedures outlined in 
ref. 35. Alternating between four atomic servos, with the magnetic field control 
activated, we measured differential shifts between a variety of high and low lattice 
depths from 87$E_r$ to 300$E_r$ for the SrII apparatus. To minimize systematic uncertainties
caused by differential atomic density shifts, both high and low lattice depths
were operated with an absolute density shift below 1 X 10” '”. Atomic interaction 
shifts follow a power-law behaviour in the trap frequency, and must be taken into 
account especially at very high lattice depths. Atom numbers were chosen to 
provide similar density shifts for both high and low lattice depths. Furthermore, 
individual measurements for particular values of lattice depth difference were 
performed at 1 X 10” statistical uncertainty. Many measurements at different 
lattice depths were needed to achieve the low uncertainty reported here (see 
Fig. 3a). Raw measurement data are binned in sets of 30 to 75 points per bin 
according to the procedure outlined in the ‘Statistical methods for data analysis’
section. To decrease the sensitivity to bin choice, the final uncertainties for the fit 
parameter are simple means of the fit parameter errors determined for all bin sizes. 
Model fits for both a hyperpolarizability and an E2/M1 contribution to the measured
Stark shift revealed no statistically significant contribution, based on a
Fisher test with a 1σ threshold. Even at this measurement precision, the small
magnitude of these shifts allowed their contributions to be included in the linear
fit. Our fitted hyperpolarizability, 0.48(47) µHz/$E_r^2$, is not inconsistent with previously
reported coefficients of 0.45(10) µHz/$E_r^2$. However, our data, as shown by
the Fisher test, support a linear fit only. We thus list only a single overall systematic 
uncertainty for the lattice a.c. Stark shift, treating the entire data set in the most 
statistically consistent manner. We note that using a prior reported hyperpolarizability
coefficient, our overall a.c. Stark uncertainty would change only slightly
from $3.7\times10^{-18}$ to $4.1\times10^{-18}$. However, in that previous measurement, to gain a
very large lever arm for the measurement of the hyperpolarizability, very high
(5,000$E_r$) lattice depths were used, but the atomic density effects in such tight
traps were not considered. In summary, although our data are not inconsistent 
with the hyperpolarizability measurements previously reported’, this work is not 
an independent verification of the previous measurement. Our data alone do not 
support a statistically significant nonlinearity, and the extracted hyperpolarizabil- 
ity would have a higher uncertainty than the previous measurement. 

Although, to first order, lattice vector shifts are cancelled by alternately measuring
opposite nuclear spin stretched states, a residual lattice vector shift can cause
systematic shifts due to its convolution with the second-order Zeeman shift. A 
lattice vector shift will cause an overall widening of the shift between opposite nuclear 
spin states, mimicking a magnetic field. As a conservative estimate, we included the 
effect of a 100-mHz residual lattice vector splitting. To calculate how this will affect 
the uncertainty related to the second-order Zeeman effect, we included this 100 mHz 
as an error to S, as defined in the ‘Magnetic field effects’ section. 
Miscellaneous shifts. Shifts from background gas collisions were estimated using 
the methods described in ref. 36. Differential $C_6$ coefficients for the Sr $^1S_0$ and $^3P_0$
states for their resonant dipole interactions were scaled to the Cs ground-state–ground-state
$C_6$ coefficients. By far the largest residual gas in our ultrahigh-vacuum,
oven-loaded system is hydrogen. Both the Sr $^1S_0$–H2 $C_6$ coefficient and the Sr $^3P_0$–H2
$C_6$ coefficient were then estimated by scaling with respect to the non-resonant
dipole Cs–H2 $C_6$ coefficient. Atomic trapping lifetimes of about 1 s (SrII) and about
8 s (SrI), average excitations during the detuned Rabi pulse and interrogation times
of 160 ms were combined to estimate the background gas collisional shift uncer- 
tainty. No shift correction is quoted, and we provide only an upper bound of this 
uncertainty. 

Using the measured temperature of around 3 µK, we calculate an average total
velocity of 3 cm s$^{-1}$. A Taylor expansion of the full relativistic Doppler shift has
second-order terms from both longitudinal and transverse motion. Overall, the
second-order Doppler effect results in a fractional shift less than $10^{-20}$.
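
The order of magnitude of this estimate can be checked with a few lines of Python. The sketch assumes that the quoted velocity is the three-dimensional r.m.s. thermal speed and uses the leading second-order term, -v^2/(2c^2); both are assumptions for illustration, as the text does not spell out the exact expressions used.

```python
import math

K_B = 1.380649e-23           # Boltzmann constant, J/K
AMU = 1.66053906660e-27      # atomic mass constant, kg
SPEED_OF_LIGHT = 299792458.0  # m/s
SR87_MASS_AMU = 86.9089      # mass of 87Sr in atomic mass units


def rms_speed(temperature_k, mass_amu=SR87_MASS_AMU):
    """Three-dimensional r.m.s. thermal speed sqrt(3 k_B T / m) in m/s."""
    return math.sqrt(3.0 * K_B * temperature_k / (mass_amu * AMU))


def second_order_doppler_fraction(speed_m_s):
    """Leading second-order Doppler fractional shift, -v^2 / (2 c^2)."""
    return -speed_m_s ** 2 / (2.0 * SPEED_OF_LIGHT ** 2)
```

Evaluating `rms_speed(3e-6)` and passing the result to `second_order_doppler_fraction` reproduces the scale of the velocity and of the bound quoted above.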

Line pulling can be caused by a variety of effects. These effects include a slight 
ellipticity in the clock laser polarization, imperfect optical pumping (<5% popu- 
lation in neighbouring nuclear spin states), and clock-laser-induced tunnelling (no 
signal was visible for tunnelling-induced sidebands). They can be modelled as a 
deformation of a perfect Rabi lineshape. The spectral narrowness of the 5-Hz Fourier- 
limited linewidth with which all data was taken in this work greatly reduced the 
effect of any possible line pulling. 

The only exception to taking data with spectrally narrow features was the inves- 
tigation of the a.c. Stark shift induced by the clock laser itself. This systematic was 
evaluated by measuring the frequency difference between our clock transition 
interrogated with 50-ms and 200-ms π pulses using our fast modulation technique.
For this measurement, the clock was run with our largest possible bias magnetic 
field to avoid any residual line pulling effects during the 50-ms clock interrogation. 
Including the errors in determining our π pulse exactly, we estimate a $1.3\times10^{-18}$
uncertainty in the probe beam a.c. Stark shift for SrII’s normal clock operation. 

An AOM is used to scan the $^{87}$Sr clock transition and to shape our laser pulse.
When the clock pulse is switched on, phase transients originating from this AOM 
can cause a measured frequency shift. We compared light from a diffracted order 
of this AOM with the zeroth-order light using a digital phase detector. After we 
calibrated and removed the effect of the detector’s phase transients from our data, 
the observed effects, when convolved with the sensitivity function of Rabi spec- 
troscopy, resulted in a shift almost consistent with zero”. 

Servo error was determined by combining many hours of lock data and mea- 
suring whether there is a systematic bias to the in-loop error signal. Any bias mea- 
sured was transformed to a frequency shift and uncertainty by modelling a perfect 
Rabi lineshape with the contrast and pulse area under which the data was taken. 
This allowed us to transform error signal bias to frequency shifts. 

Frequency comparison. After the data had been post-processed by each individual
strontium team, time-stamped and corrected frequencies were shared. Although
the overall systematic uncertainty of the comparison is $5.4\times10^{-17}$, as a consistency
check for the comparison a variety of methods were used to show the agreement of
the two clocks within this confidence interval. A simple mean of all the data gives
the difference between the two clocks to be $-2.4\times10^{-17}$. Binning the data in small
chunks, of approximately one minute per data point (as in the top panel of Fig. 1b)
gives agreement of $-2.8(2)\times10^{-17}$. The uncertainty on this number has been
inflated by $\sqrt{\chi^2_{\mathrm{reduced}}}$ because $\sqrt{\chi^2_{\mathrm{reduced}}} = 3.5$ denotes overscatter in the data.
Binning the data in 30-min chunks (as in the bottom panel of Fig. 1b) clearly
shows that there are systematic fluctuations still present in the comparison, with
$\sqrt{\chi^2_{\mathrm{reduced}}} = 10.5$ and an agreement of $-2.7(5)\times10^{-17}$. Again, this uncertainty is
inflated by $\sqrt{\chi^2_{\mathrm{reduced}}}$. The greater overscatter in the data at longer timescales is
probably caused by imprecise knowledge of the BBR environment for SrI, which
allows for fluctuations within the 1σ comparison uncertainty.

The final systematic uncertainty used in the comparison is quoted under the 
running conditions of the two strontium systems during the comparison, and not 
their final best achieved total uncertainties. Furthermore, the height difference (10 cm)
between the two atomic clouds, resulting in a $1.0\times10^{-17}$ gravitational redshift,
was included in the comparison but was not relevant for Table 1.
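
The size of the gravitational redshift can be checked from the weak-field expression g h / c^2. The Python sketch below uses the standard value of g rather than the local value at the laboratory, which is an assumption, so it reproduces the quoted figure only to within about 10%.

```python
STANDARD_GRAVITY = 9.80665    # m/s^2, standard value (local g is an assumption)
SPEED_OF_LIGHT = 299792458.0  # m/s


def gravitational_redshift_fraction(height_difference_m, g=STANDARD_GRAVITY):
    """Fractional frequency difference g * h / c^2 between two clocks."""
    return g * height_difference_m / SPEED_OF_LIGHT ** 2
```

Calling `gravitational_redshift_fraction(0.10)` gives the fractional shift for the 10-cm height difference between the two atomic clouds.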
Drought sensitivity of Amazonian carbon balance
revealed by atmospheric measurements


Feedbacks between land carbon pools and climate provide one of the 
largest sources of uncertainty in our predictions of global climate’. 
Estimates of the sensitivity of the terrestrial carbon budget to cli- 
mate anomalies in the tropics and the identification of the mechan- 
isms responsible for feedback effects remain uncertain**. The Amazon 
basin stores a vast amount of carbon’, and has experienced increas- 
ingly higher temperatures and more frequent floods and droughts 
over the past two decades*. Here we report seasonal and annual 
carbon balances across the Amazon basin, based on carbon dioxide 
and carbon monoxide measurements for the anomalously dry and 
wet years 2010 and 2011, respectively. We find that the Amazon
basin lost 0.48 ± 0.18 petagrams of carbon per year (Pg C yr$^{-1}$)
during the dry year but was carbon neutral (0.06 ± 0.1 Pg C yr$^{-1}$)
during the wet year. Taking into account carbon losses from fire by
using carbon monoxide measurements, we derived the basin net
biome exchange (that is, the carbon flux between the non-burned
forest and the atmosphere) revealing that during the dry year, vegetation
was carbon neutral. During the wet year, vegetation was a net
carbon sink of 0.25 ± 0.14 Pg C yr$^{-1}$, which is roughly consistent with
the mean long-term intact-forest biomass sink of 0.39 ± 0.10 Pg C yr$^{-1}$
previously estimated from forest censuses. Observations from Amazonian
forest plots suggest the suppression of photosynthesis during
drought as the primary cause for the 2010 sink neutralization.
Overall, our results suggest that moisture has an important role in 
determining the Amazonian carbon balance. If the recent trend of 
increasing precipitation extremes persists®, the Amazon may become 
an increasing carbon source as a result of both emissions from fires 
and the suppression of net biome exchange by drought. 

To observe the state, changes and climate sensitivity of the Amazon
carbon pools we initiated a lower-troposphere greenhouse-gas sampling
programme over the Amazon basin in 2010, measuring bi-weekly
vertical profiles of carbon dioxide (CO2), sulphur hexafluoride (SF6)
and carbon monoxide (CO) from just above the forest canopy to 4.4 km
above sea level (a.s.l.) at four locations spread across the basin (Fig. 1).
Repeated measurements of the CO2 mole fraction in the low to mid-troposphere
have the ability to constrain surface CO2 fluxes at regional
scales (about $10^5$–$10^6$ km$^2$) including all known and unknown processes.
This is in contrast to small temporal and spatial scale atmospheric
approaches, which need substantial and difficult-to-verify assumptions
to scale up; it is also in contrast to basin-scale surface-based studies,
which include only a subset of relevant processes.

Our selection of sites reflects the dominant mode of horizontal air 
flow at mid- to low-troposphere altitudes across the Amazon basin, with 
air entering the basin from the equatorial Atlantic Ocean, sweeping 
over the tropical forested region towards the Andes and turning southwards
and back to the Atlantic (Fig. 1). Air at the end-of-the-basin sites
Tabatinga (TAB) and Rio Branco (RBA) is thus exposed to carbon fluxes 
from a large fraction of the basin’s rainforest vegetation. Flux signatures 
Figure 1 | Station’s region of influence (‘footprint’). The combined
sensitivity of all observed atmospheric CO2 concentrations to surface fluxes
(that is, measurement ‘footprints’) is shown for the four sites TAB, RBA, SAN
and ALF (solid black dots). Sensitivity is given in units of concentration (p.p.m.)
per unit flux (µmol m$^{-2}$ s$^{-1}$). As seen in Extended Data Fig. 6a, footprints
from the four sites overlap substantially. Footprints are calculated at 0.5-degree
resolution using ensembles of stochastically generated back trajectories from
the FLEXPART Lagrangian particle dispersion model and then calculating
the residence times of these back trajectories in the 100 m layer above the
surface. Values above 0.001 p.p.m. per µmol m$^{-2}$ s$^{-1}$ comprise 97% of the land
surface signal and values above 0.01 p.p.m. per µmol m$^{-2}$ s$^{-1}$ comprise 50%
of the land surface signal; thus apparently small values are still important
because they occupy a large area. Black arrows represent average climatological
wind speed and direction in June, July and August (from the National Centers
for Environmental Prediction (NCEP); http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis.html)
averaged between the surface and
600 mbar. Open symbols (RPB and ASC) represent the NOAA tropical Atlantic
sites used to define the background concentrations of CO2, CO and SF6 coming
into the Amazon basin. Solid green dots indicate the locations of forest plot
clusters where long-term biomass gains and respiration have been observed.
in air at the other two sites, Alta Floresta (ALF) and Santarém (SAN)
are not only from forests but also from savanna and agricultural land.
Our measurements represent the first network of ongoing, well-calibrated
CO2 measurements over a large stretch of tropical land. Such measurements
are vital, because the near-absence of CO2 measurements
sensitive to the tropical biosphere is the underlying cause of the large
uncertainties in net flux estimates for tropical regions obtained by inverse
modelling of atmospheric CO2 (refs 14 and 15).

Fortuitously, the two years of atmospheric observations reported here 
are for an unusually dry year followed by a wet one (Fig. 2 and Extended 
Data Fig. 1a, b). Our measurements thus document the sensitivity of 
Amazon basin carbon pools to the effect of drought. The reasons for 
the dry conditions in 2010 were twofold. For the first three months an 
El Niño episode caused dry conditions in the north and centre of the
Amazon basin, whereas during the second half of the year a positive 
North Atlantic sea surface temperature anomaly locked the inter-tropical 
convergence zone (where the northeast and southeast trade winds con- 
verge) into a position that was more northerly than usual. This caused 
enhanced and prolonged dry conditions in the southern areas of the 
Amazon basin (Extended Data Fig. 1a, b). A simple diagnostic of the 
stress on vegetation exerted by the negative precipitation anomalies is 
the climatological water deficit (CWD; see Methods and Fig. 2), for
which large negative anomalies occurred in 2010 for the northwestern
basin. This is consistent with river discharge records. Lesser negative
anomalies in the northeastern basin were caused by early-year negative 
precipitation anomalies and the central-eastern and southern parts of 
the Amazon basin (‘the arc of deforestation’) had anomalies caused by 
low precipitation during the third quarter of the year. Monthly mean 
temperatures (Extended Data Fig. 1c, d) in 2010 were higher than aver- 
age in every month, with especially large anomalies in February/March 
and August/September. These mirror the periods of greatest negative 
precipitation anomalies. Warmer than average temperatures (with respect 
to the last three decades) were also observed for every month of 2011, 
but 2011 was also an unusually wet year (Extended Data Fig. 1a, b). As 
shown below, observed basin-wide carbon flux variations for 2010 and 
2011 reflect these temporal precipitation patterns. 

To isolate the contribution of Amazon terrestrial carbon sources and
sinks to the atmospheric CO2 profiles, we first subtract a scalar background
mole fraction from each of the observed profiles. This background
Figure 2 | Climatological water deficit. a, Basin-wide averages and standard
deviation of CWD, based on the Tropical Rainfall Measuring Mission. b, Fire
counts based on European Space Agency (ESA; http://due.esrin.esa.int/wfa/)
fire count data for 2010, 2011 and 1998-2011, respectively.
represents the composition of air entering the Amazon basin from
the Atlantic and is estimated as a weighted average of CO2 at Ascension
Island (ASC) and Ragged Point, Barbados (RPB) using a linear
mixing model based on ASC and RPB SF6, with weights determined
from SF6 measured at the site (Methods). SF6 is well suited for this
purpose (that is, to estimate the fractional contributions of Northern
and Southern Hemispheric air entering the basin) because it has a large
inter-hemispheric difference (Extended Data Figure 8) and virtually no
Amazonian emissions.
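
A two-end-member mixing model of this kind can be sketched in Python as follows. The function name, argument names and the clamping of the mixing fraction to the range 0 to 1 are assumptions made for illustration; the exact implementation is described in the Methods, which are not reproduced here.

```python
def background_mole_fraction(sf6_site, sf6_asc, sf6_rpb, x_asc, x_rpb):
    """Background mole fraction of a gas X (CO2 or CO) at an Amazon site.

    The fraction of air arriving from the Northern Hemisphere end member
    (RPB) is inferred from SF6 measured at the site, and the same weights
    are then applied to the gas X measured at ASC and RPB.
    """
    f_rpb = (sf6_site - sf6_asc) / (sf6_rpb - sf6_asc)
    f_rpb = min(1.0, max(0.0, f_rpb))  # clamp to a physical range (assumption)
    return f_rpb * x_rpb + (1.0 - f_rpb) * x_asc
```

With hypothetical SF6 values of 7.0 (ASC), 7.4 (RPB) and 7.1 (site), a quarter of the air is attributed to the RPB end member, so the background lies a quarter of the way from the ASC value to the RPB value.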

Carbon sources and sinks reveal themselves in the referenced profiles
$\Delta X = X_{\mathrm{site}} - X_{\mathrm{bg}}$ as mole fraction enhancements and depletions,
where X is the mole fraction of CO2 or CO, for site and background.
The enhancements and depletions are generally confined to the lowermost
2 km or so of the profiles (Fig. 3). For ΔCO2 (Fig. 3a–d), there is
a strong tendency towards surface enhancements during the dry season,
although both lower-troposphere depletions and enhancements
can be observed at any time of the year. Vertical profiles of ΔCO show
very large enhancements above the Atlantic background in the dry
season, persisting into the free troposphere (Fig. 3e–h and Extended
Data Fig. 2). CO is a product of incomplete combustion and in the
Amazon it reflects a contribution to CO2 enhancements from biomass
burning. This is confirmed by calculated air-mass back-trajectories
intersecting satellite-sensed fire hotspots (Extended Data Fig. 3) and
by our observed CO:CO2 ratios, which are typical for those from tropical
forest fires (Methods).

From the profiles of ΔX we estimate fluxes by dividing them by the
air-mass travel time t from the coast to the sampling site and integrating
from the surface (0 km above ground level, a.g.l.) to 4.4 km a.s.l.,
with t determined by air-mass back-trajectories calculated separately for each
of (typically) 12 air samples per profile, to obtain:

$$F_X = \int_{z = 0\,\mathrm{km\ (a.g.l.)}}^{4.4\,\mathrm{km\ a.s.l.}} \frac{\Delta X}{t(z)}\,\mathrm{d}z \qquad (1)$$

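Using measured CO:CO2 emission ratios (refs 9 and 20), we further estimate
the biomass burning contribution ($F^{\mathrm{bb}}_{\mathrm{CO_2}}$) to the net carbon
flux using:

$$F^{\mathrm{bb}}_{\mathrm{CO_2}} = r^{\mathrm{bb}}_{\mathrm{CO_2:CO}} \left(F_{\mathrm{CO}} - F^{\mathrm{bio}}_{\mathrm{CO}}\right) \qquad (2)$$

where $r^{\mathrm{bb}}_{\mathrm{CO_2:CO}}$ is the biomass-burning ratio of CO2 to CO emissions implied by the
measured emission ratios, and $F^{\mathrm{bio}}_{\mathrm{CO}}$ is the stable (background) value of $F_{\mathrm{CO}}$ during the wet
season, reflecting direct plant and soil CO emissions as well as production
from rapid oxidation of biogenic volatile organic compounds.

The non-fire net biome exchange (NBE) flux $F^{\mathrm{NBE}}_{\mathrm{CO_2}}$ is then given by:

$$F^{\mathrm{NBE}}_{\mathrm{CO_2}} = F^{\mathrm{total}}_{\mathrm{CO_2}} - F^{\mathrm{bb}}_{\mathrm{CO_2}} \qquad (3)$$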
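
The three flux equations can be illustrated with a short Python sketch. All function and argument names are hypothetical, the enhancement profile is assumed to have been converted already from mole fraction to a concentration (for example mol m^-3), and a simple trapezoidal rule stands in for whatever integration scheme was actually used.

```python
def column_flux(z_m, delta_conc, travel_time_s):
    """Equation (1): integrate (enhancement / travel time) over height.

    z_m            sample heights in metres, increasing
    delta_conc     enhancement at each height (assumed already in mol m^-3)
    travel_time_s  air-mass travel time from the coast for each sample, s
    Returns a flux in mol m^-2 s^-1 (trapezoidal rule).
    """
    integrand = [dc / t for dc, t in zip(delta_conc, travel_time_s)]
    total = 0.0
    for k in range(1, len(z_m)):
        total += 0.5 * (integrand[k] + integrand[k - 1]) * (z_m[k] - z_m[k - 1])
    return total


def biomass_burning_co2_flux(f_co, f_co_bio, r_co2_per_co):
    """Equation (2): fire CO2 flux from the CO flux above its biogenic baseline."""
    return r_co2_per_co * (f_co - f_co_bio)


def net_biome_exchange(f_co2_total, f_co2_bb):
    """Equation (3): non-fire flux as total CO2 flux minus the fire contribution."""
    return f_co2_total - f_co2_bb
```

As a check against Table 1, `net_biome_exchange(0.15, 0.13)` reproduces the 2010 NBE of 0.02 g C m^-2 d^-1 at TAB from the tabulated total and fire fluxes.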


Our flux calculations (Fig. 4 and Table 1) reveal basin-wide average total
fluxes of 0.19 ± 0.07 g C m$^{-2}$ d$^{-1}$ in 2010 and 0.02 ± 0.04 g C m$^{-2}$ d$^{-1}$
in 2011. Riverine carbon outgassing is included in these fluxes but contributes
minimally because the riverine organic carbon loop is very nearly
closed within the Amazon basin, and fossil fuel emissions in the basin
are negligibly small (<0.02 Pg C yr$^{-1}$; see Methods). Flux uncertainties
presented in Fig. 4 and Table 1 may be underestimates because of losses
of surface signal above 4.4 km caused by convective processes not captured
by our extrapolation technique (Extended Data Table 1a). Our
imperfect knowledge of convection and the difficulty of measuring CO2
in the upper troposphere hamper quantification of these errors.
Using a basin area of $6.77\times10^6$ km$^2$ we calculate a source to the
atmosphere of 0.48 ± 0.18 Pg C in 2010. In contrast, 2011 displayed an
approximately neutral carbon balance (0.06 ± 0.10 Pg C yr$^{-1}$). In 2010,
we calculate carbon losses due to fires of 0.51 ± 0.12 Pg C yr$^{-1}$, implying
a carbon-neutral residual (that is, approximately zero NBE). On the
other hand, for 2011 when NBE was −0.25 ± 0.14 Pg C yr$^{-1}$ the overall
carbon balance was neutral, because this was offset by fire-associated
losses of roughly the same size (0.30 ± 0.10 Pg C yr$^{-1}$). The return of
Figure 3 | Surface flux signals in vertical profiles. a-d, Mean difference
between CO2 profiles measured in 2010 at the four Amazonian aircraft
sampling sites and oceanic CO2 background (that is, ΔCO2) during the dry
(red lines) and wet (blue lines) seasons, respectively (solid lines) and the
standard deviation divided by the square root of number of profiles
(dashed lines). The background is estimated from in situ SF6 and CO2 at the


the unburned Amazonian vegetation to being a sink in 2011 seems to
have been driven primarily by precipitation, which changed from a negative
anomaly in 2010 to a positive anomaly in 2011 (Extended Data
Fig. 1a, b). However, temperatures were higher than average for both
years, reflecting a net warming trend in recent decades (Extended Data
Fig. 1c, d).
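
The conversion from a basin-average flux per unit area to an annual basin total is plain unit arithmetic, sketched below in Python. The function name and the 365-day year are assumptions; the basin area is the 6.77 million km^2 quoted above.

```python
def scale_to_basin_pg_c_per_year(flux_g_c_m2_day, area_km2=6.77e6,
                                 days_per_year=365.0):
    """Convert a flux in g C m^-2 d^-1 to a basin total in Pg C yr^-1."""
    area_m2 = area_km2 * 1.0e6                 # 1 km^2 = 10^6 m^2
    grams_per_year = flux_g_c_m2_day * area_m2 * days_per_year
    return grams_per_year / 1.0e15             # 1 Pg = 10^15 g
```

Applied to the 2010 basin-average flux of 0.19 g C m^-2 d^-1, this gives about 0.47 Pg C yr^-1, close to the quoted 0.48 Pg C; the small difference is presumably due to rounding of the per-area flux.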
e-h, As for a-d, but for CO. p.p.t., parts per trillion. The dry season (red lines)
is affected by fires at most sites and is here defined as July-October for 
illustrative purposes only; it does not correspond to all months with fire 
emissions (see Methods). 


A more detailed picture of the Amazonian carbon cycle response to 
climate is revealed by the quarterly fluxes and by focusing first on RBA, 
TAB and ALF. For both years, during the first quarter of the year (the 
start of the wet season), measurements indicate a net carbon sink, and 
during the second and drier half of the year, measurements indicate a 
net source (Fig. 4a). However, during the second quarter of 2010 (in 


Figure 4 | Flux estimates summary. Quarterly 
flux and standard error (see Methods) of total 
Sites                              TAB             RBA             SAN             ALF             Scaled flux (Pg C yr^-1)†
2010 fluxes (g C m^-2 d^-1)
  Total                            0.15 ± 0.10     0.17 ± 0.11     0.33 ± 0.50     0.29 ± 0.15     0.48 ± 0.18
  Fire                             0.13 ± 0.05     0.17 ± 0.06     0.57 ± 0.45     0.28 ± 0.09     0.51 ± 0.12
  NBE                              0.02 ± 0.11     0.00 ± 0.13     −0.25 ± 0.70    0.01 ± 0.17     −0.03 ± 0.22
2011 fluxes (g C m^-2 d^-1)
  Total                            −0.10 ± 0.07    −0.04 ± 0.07    0.46 ± 0.20     0.24 ± 0.06     0.06 ± 0.10
  Fire                             0.08 ± 0.03     0.09 ± 0.03     0.44 ± 0.51     0.16 ± 0.04     0.30 ± 0.10
  NBE                              −0.18 ± 0.08    −0.13 ± 0.08    0.02 ± 0.84     0.08 ± 0.07     −0.25 ± 0.14
Area of influence (10^6 km^2)*     2.53            3.67            0.59            1.31

The uncertainties are standard errors calculated by propagating uncertainties in all equations using a Monte Carlo approach, and then taking half the value of the 16th-84th percentile range. A bootstrapping approach to calculate the standard error (2.5th-97.5th percentile range) yields slightly smaller values.
* Back-trajectory ensemble envelope (that is, the total area of influence of a measuring site as estimated from wind back-trajectory ensembles).
† ‘Scaled’ means the flux estimates have been scaled to the tropical South America forested area, assuming an Amazon forest area of 6.77 × 10^6 km^2 (ref. 30).


contrast to 2011) we calculate the flux to be a carbon source, which 
slightly lags the strong precipitation and temperature anomalies in 
February and March. Net emissions during the second half of 2010 
were more than twice as large as in 2011, corresponding to precipita- 
tion and temperature anomalies in August and September 2010. For 
both years, however, the difference in carbon release between the sec- 
ond and first half of the year is mainly due to fire emissions (Fig. 4 and 
Extended Data Fig. 2). The larger fire emissions in 2010 are consistent 
with the anomalously high fire counts observed from space (Fig. 2b, 
Extended Data Figure 2) and basin-wide CO anomalies, which in 2010 
extended well above ~2 km a.s.l. (roughly the planetary boundary layer
height) into the free troposphere, even at the more remote sites RBA 
and TAB (Fig. 3e-h and Extended Data Fig. 2). Moreover, the ‘arc of 
deforestation’ in the southern and eastern Amazon basin was one of 
the regions with the strongest precipitation anomalies (Extended Data 
Fig. 1a, b), intensifying the meteorological conditions required for fire
ignition and persistence, and probably leading to the large burning 
emissions we observed in 2010. After accounting for fire emissions, the 
residual NBE reveals large differences between the years, especially for 
the second and fourth quarters, for which there were large carbon 
releases in 2010 but smaller ones in 2011. This difference in seasonality 
between the two years appears to reflect a lagged drought stress induced 
by precipitation anomalies in February/March (first quarter) and August/ 
September (third quarter) of 2010. 

The fluxes calculated from the SAN data differ from the other three 
sites both in seasonality and in the contrast between 2010 and 2011, 
with a strong carbon source in the first quarter of the year for air sam- 
pled upwind of SAN (but not the other three sites) especially notable. 
This may result in part from the fire season extending into January for 
the eastern Amazon and northeast Brazil, which is not the case for the 
moister central/western areas. Additionally, eddy-flux data and CO2
vertical profile analysis show that (unburned) forests in the eastern
Amazon are net sinks in the dry season and net sources in the wet season. 
In contrast, other sites tend to show wet season uptake (Figs 3a—d and 4). 

Additional insight about the cause of the difference in 2010 and 
2011 NBE comes from observations at a network of 14 intensive forest 
carbon cycle measurement plots established across the Amazon basin. 
At these plots a near-complete suite of carbon pools is being observed, 
providing an estimate of net primary production and autotrophic res- 
piration and thus an upper bound on gross primary production’. Six of 
these plots experienced anomalous drought stress in 2010, at which 
time gross primary production declined (Extended Data Fig. 5a), and 
there were minimal positive temperature anomalies (Extended Data 
Fig. 5b). Combined, atmospheric mass balance and forest plot analysis 
suggest that drought has an important negative effect on Amazon forest 
productivity, with likely consequences for future changes in the
forests. This is in contrast to a recent analysis of future Amazon carbon 
losses calibrated via inter-annual responses of global atmospheric CO2 
growth rates to tropical temperature anomalies”. 


Tropical temperature anomalies have tended to covary with mois- 
ture anomalies in the past, so although these models seem to reproduce 
recent variability correctly they may do so for the wrong reason. More- 
over, as 2011 shows, positive temperature anomalies can also coincide 
with non-drought years. 

Besides the new insights into large-scale controls of carbon pool res- 
ponses in a changing climate, our results provide a top-down confir- 
mation that during non-drought years intact Amazonian forests are a 
substantial carbon sink, consistent with theoretical predictions for forest 
biomass alone’. Our NBE estimate for 2011 is smaller than the mean 
annual biomass sink of 0.39 ± 0.10 Pg C estimated for the 1980–2004
period based on repeated censuses at a widespread forest plot network’. 
However, our fire flux estimate is not identical to the total deforestation 
emissions, which includes emissions from heterotrophic respiration, 
thus slightly biasing our NBE estimate. The Deforestation Carbon Flux 
(DECAF) land-use change model”® suggests that the sources of defor- 
estation emissions in the southern Amazon are typically 30% respiration 
and 70% fire, implying 2011 deforestation fluxes of about +0.4 Pg C yr⁻¹,
and therefore NBE of about −0.4 Pg C yr⁻¹, closing the gap between
the top-down and bottom-up estimates. In 2011 in particular, respira- 
tion could have been stimulated following enhanced tree mortality 
caused by the 2010 drought’’. 
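The arithmetic behind these two figures can be checked with a short sketch. It uses the scaled 2011 fire flux (0.30 Pg C yr⁻¹) and NBE (−0.25 Pg C yr⁻¹) from the table and the 70% fire share quoted above. Treating the respiration share as the amount to move from NBE into deforestation emissions is one reading of the text, and the function name is hypothetical.

```python
def deforestation_flux(fire_flux, fire_fraction=0.70):
    """Total deforestation flux implied if fire is `fire_fraction` of it."""
    if not 0 < fire_fraction <= 1:
        raise ValueError("fire_fraction must be in (0, 1]")
    return fire_flux / fire_fraction


fire_2011 = 0.30   # scaled 2011 fire flux, Pg C per year (from the table)
nbe_2011 = -0.25   # scaled 2011 NBE, Pg C per year (from the table)

total_deforestation = deforestation_flux(fire_2011)   # about 0.43
respiration_part = total_deforestation - fire_2011    # about 0.13
# Assumption: the respiration share was previously counted inside NBE,
# so removing it makes the NBE sink correspondingly larger.
nbe_adjusted = nbe_2011 - respiration_part            # about -0.38
```

Both results round to the "about 0.4" magnitudes given in the text.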

In summary, we have empirically documented a pronounced response
of a large fraction of the Amazonian vegetation to drought, with
forest productivity stalled and large amounts of carbon released by fire 
in 2010. The Amazon basin returned to being a net carbon sink in 
2011. But our results are cause for concern in the light of the recent 
increase in precipitation extremes and increasing temperatures. If these 
climate trends continue, future shifts in Amazon forest function, lead- 
ing to reduced carbon uptake, are likely. This could exacerbate carbon 
losses as a result of direct human activities such as deforestation. 


METHODS SUMMARY 


Air sample profiles were taken using small aircraft descending in a spiral from approx- 
imately 4,420 m to about 300 m a.s.l. (as close to the forest canopy as possible), 
semi-automatically filling 12 (for the TAB, ALF and RBA sites) and 17 (for the 
SAN site) 0.7-litre flasks controlled from a microprocessor and contained in one 
suitcase. Profiles are taken between 12:00 and 13:00 local time. At that time, the 
boundary layer is close to being fully developed. Once a vertical profile has been 
sampled (one suitcase filled) it is transported to the IPEN Atmospheric Chemistry 
Laboratory in Sao Paulo, where samples are analysed by a replica of the NOAA/ 
ESRL trace gas analysis system. All aircraft data used in this study are available at
ftp://ftppub.ipen.br/nature_gatti_etal/. The accuracy and precision of the system 
are evaluated with three independent procedures that demonstrate excellent performance,
with long-term repeatability (1σ) of ±0.03 parts per million (p.p.m.)
and a difference between measured and calibrated values of 0.03 p.p.m. Because
NOAA/ESRL Atlantic data from the ASC and RPB sites are used as background 
values for Amazonian measurements made at IPEN, this high accuracy is required 
to ensure that spatial gradients are not artefacts of calibration. The CO and SF6
measurements presented here are also made at IPEN with calibration standards
tied directly to the World Meteorological Organization reference scales maintained
by NOAA/ESRL.


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

Convective forcing of mercury and ozone in the Arctic
boundary layer induced by leads in sea ice

The ongoing regime shift of Arctic sea ice from perennial to seasonal
ice is associated with more dynamic patterns of opening and closing
sea-ice leads (large transient channels of open water in the ice),
which may affect atmospheric and biogeochemical cycles in the Arctic.
Mercury and ozone are rapidly removed from the atmospheric boundary
layer during depletion events in the Arctic, caused by destruction
of ozone along with oxidation of gaseous elemental mercury
(Hg(0)) to oxidized mercury (Hg(II)) in the atmosphere and its
subsequent deposition to snow and ice. Ozone depletion events can
change the oxidative capacity of the air by affecting atmospheric
hydroxyl radical chemistry, whereas atmospheric mercury depletion
events can increase the deposition of mercury to the Arctic,
some of which can enter ecosystems during snowmelt. Here we
present near-surface measurements of atmospheric mercury and 
ozone from two Arctic field campaigns near Barrow, Alaska. We find 
that coastal depletion events are directly linked to sea-ice dynamics. 
A consolidated ice cover facilitates the depletion of Hg(0) and ozone, 
but these immediately recover to near-background concentrations 
in the upwind presence of open sea-ice leads. We attribute the rapid 
recoveries of Hg(0) and ozone to lead-initiated shallow convection 
in the stable Arctic boundary layer, which mixes Hg(0) and ozone 
from undepleted air masses aloft. This convective forcing provides 
additional Hg(0) to the surface layer at a time of active depletion 
chemistry, where it is subject to renewed oxidation. Future work will 
need to establish the degree to which large-scale changes in sea-ice 
dynamics across the Arctic alter ozone chemistry and mercury 
deposition in fragile Arctic ecosystems. 

Profound changes that have occurred recently in the Arctic sea ice
include historic minimum extents of perennial sea ice and a shift to
thinner seasonal sea ice, which experiences more dynamic patterns
of opening and closing sea-ice leads. These changes have consequences
for the Arctic energy balance and the Earth’s radiation budget, with a
positive feedback that can accelerate Arctic warming. Here we show
that atmospheric mercury (Hg) and ozone (O3) depletion events near
Barrow, Alaska, are directly linked to sea-ice dynamics in the Beaufort
and Chukchi seas. We performed near-surface measurements of atmospheric
Hg and O3 directly over the frozen Chukchi Sea during two field
studies: the Bromine, Ozone, and Mercury Experiment (BROMEX)
in March/April 2012, and the Ocean-Atmosphere-Sea Ice-Snowpack
(OASIS) campaign in March 2009 (Fig. 1). We characterized the surrounding
sea-ice conditions with daily Moderate Resolution Imaging
Spectroradiometer (MODIS) satellite images and marked the location
of open leads in the path of air masses during the 24 hours
before the air masses arrived at the site. We consistently observed that
periods of strong and concurrent atmospheric Hg depletion events (below
0.8 ng m⁻³) and O3 depletion events (below 5 parts per billion by volume
(p.p.b.v.)) occurred when upwind areas consisted largely of consolidated
sea-ice cover (completely frozen or containing fully refrozen leads). Periods
when air masses travelled over open leads within about 150 km upwind
of Barrow, however, were associated with higher, undepleted Hg(0)
and O3 concentrations (Fig. 1).
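The two thresholds quoted above (Hg(0) below 0.8 ng m⁻³ and O3 below 5 p.p.b.v.) can be expressed as a simple classifier. This is only an illustrative sketch: the function and constant names are hypothetical, and the paper does not describe its screening procedure in this form.

```python
HG0_DEPLETION_NG_M3 = 0.8   # Hg(0) depletion threshold quoted in the text
O3_DEPLETION_PPBV = 5.0     # O3 depletion threshold quoted in the text


def is_concurrent_depletion(hg0_ng_m3, o3_ppbv):
    """True when both species are below their depletion thresholds."""
    return hg0_ng_m3 < HG0_DEPLETION_NG_M3 and o3_ppbv < O3_DEPLETION_PPBV


# Hypothetical hourly readings as (Hg(0) in ng per cubic metre, O3 in p.p.b.v.)
readings = [(0.5, 3.0), (1.2, 33.0), (0.5, 15.0)]
flags = [is_concurrent_depletion(hg0, o3) for hg0, o3 in readings]
```

Only the first reading counts as a strong, concurrent depletion; the second corresponds to the recovered concentrations reported after a lead opened.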

Using high-temporal-resolution (every 4 hours) National Oceanic and
Atmospheric Administration (NOAA) Hybrid Single Particle Lagrangian
Integrated Trajectory Model (HYSPLIT) back-trajectories, we show the
effects of sea-ice leads on boundary layer Hg(0) and O3 for several periods
associated with dramatic changes in Hg(0) and O3 concentrations (Figs 2
and 3 and Extended Data Figs 1 and 2). The first period in 2012 (Fig. 2)
shows strong increases in Hg(0) and O3 concentrations when back-trajectories
switched from areas dominated by consolidated sea ice to
areas with open leads. Initially (cases 1 and 2), back-trajectories travelled
entirely over consolidated sea ice; and although open leads occurred
north of Barrow, back-trajectories did not intersect with these within
the previous 24 hours. Therefore, the open leads were not affecting
atmospheric Hg(0) and O3 concentrations for these two cases. During this
period, Hg(0) and O3 concentrations in the Arctic atmospheric boundary
layer were depleted (<0.6 ng m⁻³ and <15 p.p.b.v., respectively),
indicating an ongoing atmospheric Hg and O3 depletion event. A new,
2-km-wide lead opened northeast of Barrow on 24 March 2012 (case 3).
Although back-trajectories had changed little since the two previous days
(also supported by consistent wind velocities, see Extended Data Fig. 3),
they now crossed this open lead and concentrations of Hg(0) and O3
dramatically increased to 1.2 ng m⁻³ and 33 p.p.b.v. within hours,
approaching Northern Hemisphere background concentrations (roughly
1.5 ng m⁻³ for Hg(0) and 30 p.p.b.v. for O3).

In a second example period in 2009 (Fig. 3), open leads were present
close to Barrow on 13 March, but air masses initially travelled over areas
consisting of consolidated sea ice (case 4). This period was marked by
decreasing O3 concentrations (starting at 40 p.p.b.v. and decreasing to
<5 p.p.b.v.), showing an O3 depletion event. Early on 14 March, O3
concentrations increased threefold within 1–2 hours, and Hg(0) was
correspondingly at near-background concentrations, exactly when the
back-trajectories crossed a newly opened, 1-km-wide lead northeast of
Barrow (cases 5 and 6). When this lead refroze on 14 March and the
trajectories later moved south over consolidated sea ice during the next
2 days, concentrations of Hg(0) and O3 quickly depleted to near instrument
detection limits (case 7). Again, on 16 March, concentrations of
O3 quickly increased when air-mass trajectories crossed a newly developed
lead 20 km from Barrow (case 8). As this lead continued to widen (case 9),
both Hg(0) and O3 remained near background levels.

Two more time periods when highly dynamic patterns of Hg(0) and
O3 were directly linked to sea-ice lead dynamics are shown in Extended
Data Figs 1 and 2, demonstrating a total of 15 cases of this interaction.
Patterns were consistent throughout all periods: when air masses travelled
over consolidated sea ice or refrozen leads, Hg(0) and O3 were
depleted or showed decreasing concentrations. When air masses crossed
open leads within the 24 hours before measurements, Hg(0) and O3


1Division of Atmospheric Sciences, Desert Research Institute, Reno, Nevada 89523, USA. 2Air Quality Processes Research Section, Environment Canada, Toronto, Ontario M3H 5T4, Canada. 3US Army Cold Regions Research and Engineering Laboratory, Fort Wainwright, Alaska 99703, USA. 4Institute of Environmental Physics, University of Bremen, Bremen 28359, Germany. 5Jet Propulsion Laboratory, California Institute of Technology, Pasadena, California 91109, USA.
*These authors contributed equally to this work. 


6 FEBRUARY 2014 | VOL 506 | NATURE | 81 


©2014 Macmillan Publishers Limited. All rights reserved 




concentrations were not, or were only slightly, depleted. One exception
to this pattern occurred during a period after 18 March 2009 (Fig. 1),
when the seasonal sea ice surrounding Barrow was characterized by
large upwind leads (up to 30 km wide) and showed a complex mixture of
frozen surface, open and refrozen leads. Both Hg(0) and O3 showed
dynamic fluctuations between depletion and background levels under
these conditions, but the temporal and spatial resolution of the satellite
imagery did not allow us to link open leads directly to Hg(0) and O3
concentrations during that time. Mean concentrations of Hg(0) and O3
during the eight distinct periods highlighted in Fig. 1 were statistically
(P < 0.01) significantly lower during periods when air masses were
unaffected by open leads than during upwind crossing of open leads
(Extended Data Table 1).

Figure 1 | Time series of Hg(0) and O3 concentrations. Concentrations of
Hg(0) and O3 for 2012 (a) and 2009 (b). Yellow boxes are periods when air
masses crossed upwind areas of consolidated sea ice. Black boxes are periods
when air-mass trajectories crossed open leads. The satellite images represent
four typical sea-ice conditions that occurred during measurements. Original
satellite images from Google Earth, Terrametrics.

Figure 2 | Impact of sea-ice leads on Hg(0) and O3 in 2012. Hg(0) and O3
concentrations between 21 March 2012 and 26 March 2012. Bold numbers
correspond to time periods as numbered on the corresponding satellite images.
Satellite images were taken at approximately 16:00 UTC (Coordinated Universal
Time) each day. Colours represent 24-hour HYSPLIT back-trajectory arrival
times near Barrow: orange, 04:00 UTC; blue, 08:00 UTC; red, 12:00 UTC; pink,
16:00 UTC; yellow, 20:00 UTC; black, 00:00 UTC (the next day); and purple,
04:00 UTC (the next day). Original satellite images from Google Earth, Terrametrics.

Figure 3 | Impact of sea-ice leads on Hg(0) and O3 in 2009. Hg(0) and O3
concentrations between 12 March 2009 and 18 March 2009. Missing
Hg(0) concentrations are due to the analyser being moved between
locations. Bold numbers correspond to time periods as numbered on the
corresponding satellite images. Satellite images were taken at
approximately 16:00 UTC each day. Colours represent 24-hour HYSPLIT
back-trajectory arrival times near Barrow: orange, 04:00 UTC; blue,
08:00 UTC; red, 12:00 UTC; pink, 16:00 UTC; yellow, 20:00 UTC; black,
00:00 UTC (the next day); and purple, 04:00 UTC (the next day). Original
satellite images from Google Earth, Terrametrics.

We considered both chemical and physical processes to explain
observed linkages between lead dynamics and boundary-layer chemistry.
Measurable levels of gaseous bromine species were reported during
both years of measurements, consistent with active bromine chemistry,
which is associated with O3 and atmospheric Hg depletion events.
Depletions of Hg(0) and O3 occurring over consolidated sea ice were
probably induced by active bromine chemistry in the area, although
this process is not yet fully understood.

However, we cannot attribute the recoveries of concentrations within
1–2 hours of lead openings to chemical processes only. For example,
bromine monoxide (BrO) concentrations, as identified from the Global
Ozone Monitoring Experiment-2 (GOME-2) spectrometer, showed
a consistent presence of large BrO clouds in the region. Therefore, the
patterns of Hg(0) and O3 were unrelated to coincidental patterns at
the edges of BrO clouds (Extended Data Fig. 4). This was supported by
direct measurements of BrO concentrations at the site that were not
significantly correlated with Hg(0) or O3. Even if atmospheric Hg- and
O3-depletion-event chemistry were to stop on contact with open leads,
the Hg(0) and O3 concentrations would remain depleted for some time
and would not quickly recover within just a few hours. Sources of Hg
leading to partial recovery of Hg(0) after atmospheric Hg depletion events
could include the photochemical reduction of Hg(II) and re-emission
from surfaces, but no such source exists for O3 because O3 is
destroyed during oxidation. Another possible source for Hg(0) recovery
involves emissions from Arctic Ocean water. However, this would
not explain the simultaneous recovery of both Hg(0) and O3 because O3
is not typically emitted by the ocean. It is also striking that recoveries of
both Hg(0) and O3 consistently reached levels near Northern Hemisphere
background concentrations, independently of the size of the leads. If
there were an ocean source of Hg, this would not be expected.
We attribute the fast transitions from depleted to non-depleted Hg(0)
and O3 levels to changes in boundary-layer dynamics induced by sea-ice
leads, which dominate the effects of underlying depletion chemistry.
Lead openings generate large sensible and latent heat fluxes from
the water surface to the atmosphere owing to strong temperature gradients
(more than 20 K) between the warmer ocean water and cold
polar atmosphere. This heat transfer causes significant convective
mixing in the atmosphere directly above and downwind of leads
(see video in ref. 13 of the lead cloud recorded during BROMEX). We
propose that such convective mixing produces fast recoveries of surface
Hg(0) and O3 from air masses aloft. Vertical measurements of Hg(0)
and O3 in the stable polar boundary layer have shown that Hg(0) and
O3 depletions are limited to the surface layer, whereas air aloft is not
depleted. We confirmed increased turbulent mixing using radiosonde
data from Barrow during several periods when shallow boundary
layers quickly grew in height in the presence of open sea water (for
example, see Extended Data Fig. 3). It is also unlikely that increased
wind speed alone (often associated with, and a cause of, opening sea-ice
leads) would explain Hg(0) and O3 recoveries through increased wind
shear, given the periods of Hg(0) and O3 recoveries when wind speeds
changed little and remained low (below 3 m s⁻¹; Extended Data Fig. 5).
The implications of the observed effects of the dynamics of sea-ice
leads on atmospheric Hg and O3 are large: the recovery of Hg(0) and
O3 via convective transport induced by open leads is
probably a source of additional Hg(0) and O3 to the atmospheric surface
layer in the Arctic, all other factors remaining unchanged. Once in the
surface layer, resupplied O3 and Hg(0) from aloft can participate in
renewed depletion chemistry, as the sea-ice leads occur at a time of active
depletion chemistry, possibly increasing deposition loads attributed to
Hg depletion events and the total amount of O3 destroyed in the
atmosphere. It is possible that a warming environment and changes in
sea-ice cover produce other changes in Arctic chemistry that may
ultimately affect the quantity of Hg accumulating in biota (for example,
shorter duration of sea-ice cover may cause increased photochemical
reduction and photo-degradation of methyl mercury). As seasonal
sea ice increases at the expense of perennial sea ice and lead activity
is expected to increase, large areas across the Arctic may experience
increased convective replenishment of Hg(0) and O3, affecting Hg
oxidation and O3 depletion chemistry, as observed in Barrow.
Lead-induced shallow convective mixing of air in response to sea-ice leads,
as shown for Hg(0) and O3, could also affect the boundary-layer input
of other pollutants such as persistent organics, aerosols and other
heavy metals.


METHODS SUMMARY 

Ground measurements. Ground-based measurements during BROMEX included 
characterization of speciated atmospheric Hg (including Hg(0), Hg(II) gaseous and
Hg(II) particulate) and O3 on the frozen Chukchi Sea (2 km off the coast) and on the
frozen tundra (5 km inland), but only data from the sea ice were used for this study.
During the OASIS campaign, atmospheric Hg speciation was measured at three
different locations over the frozen Arctic Ocean. All the data presented passed
strict quality assurance and control protocols, and the Hg(0) concentrations
presented are hourly averages. To be consistent, all O3 concentrations used for both
years were from the NOAA-operated Barrow Observatory. Correlation analysis showed 
excellent agreement of data measured at all stations. 

Satellite images and HYSPLIT modelling. Daily, densely gridded (250 m) MODIS 
images of ice conditions were composed of the 7-2-1 bands (2,105-2,155 nm, 841- 
876 nm and 620-670 nm wavelengths) from the NASA Terra satellite. These were 
combined with high-temporal-resolution NOAA HYSPLIT air-mass trajectories 
modelled 24 hours backwards in time and generated every 4 hours based on
meteorology data from the Global Data Assimilation System (GDAS) interpolated
from a 1.0° by 1.0° grid in latitude and longitude. HYSPLIT model runs were
performed at 25 m, 225 m and 400 m. All back-trajectories presented are at 25 m
and, owing to atmospheric stability, were verified at the heights of 225 m and
400 m. HYSPLIT trajectories generated with GDAS meteorological data were also 
checked with back-trajectories generated from Weather Research and Forecasting 
Model meteorological data to verify their paths. Daily satellite images were used for 
both years to map sea-ice conditions precisely over several hundred kilometres 
around our measurement domain—including solid sea ice, open sea-ice leads and 
refreezing of previously open leads. The images were overlaid with the HYSPLIT 
back-trajectories to assess how O3 depletion events and atmospheric Hg depletion 
events measured on the ground near Barrow related to the upwind footprint area of 
measured air masses. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

Pathogens and insect herbivores drive rainforest
plant diversity and composition

Robert Bagchi, Rachel E. Gallery, Sofia Gripenberg, Sarah J. Gurr, Lakshmi Narayan, Claire E. Addis,
Robert P. Freckleton & Owen T. Lewis


Tropical forests are important reservoirs of biodiversity, but the
processes that maintain this diversity remain poorly understood.
The Janzen-Connell hypothesis suggests that specialized natural
enemies such as insect herbivores and fungal pathogens maintain 
high diversity by elevating mortality when plant species occur at high 
density (negative density dependence; NDD). NDD has been detected 
widely in tropical forests, but the prediction that NDD caused by
insects and pathogens has a community-wide role in maintaining 
tropical plant diversity remains untested. We show experimentally 
that changes in plant diversity and species composition are caused 
by fungal pathogens and insect herbivores. Effective plant species 
richness increased across the seed-to-seedling transition, corresponding
to large changes in species composition. Treating seeds
and young seedlings with fungicides significantly reduced the diver- 
sity of the seedling assemblage, consistent with the Janzen-Connell 
hypothesis. Although suppressing insect herbivores using insecti- 
cides did not alter species diversity, it greatly increased seedling recruit- 
ment and caused a marked shift in seedling species composition. 
Overall, seedling recruitment was significantly reduced at high con- 
specific seed densities and this NDD was greatest for the species that 
were most abundant as seeds. Suppressing fungi reduced the nega- 
tive effects of density on recruitment, confirming that the diversity- 
enhancing effect of fungi is mediated by NDD. Our study provides 
an overall test of the Janzen-Connell hypothesis and demonstrates
the crucial role that insects and pathogens have both in structuring trop- 
ical plant communities and in maintaining their remarkable diversity. 
Understanding the mechanisms that allow species to coexist in nat- 
ural ecosystems is one of the most enduring questions in community 
ecology. The key challenge is to identify how competitive exclusion is 
prevented, particularly in situations where large numbers of species share 
similar resource requirements. This question has special relevance to
tropical forest plant communities, which can have exceptional species
richness. The rapid degradation and destruction of tropical forests
and the large impact this may have on global biodiversity, carbon and
water cycling and climate feedbacks make understanding the mechanisms
maintaining and structuring their diversity imperative.
There is compelling evidence that natural enemies, including insect
herbivores and fungal and oomycete pathogens (hereafter referred to
collectively as pathogens), regulate many plant populations in the tropics
and elsewhere. Transmission of natural enemies is more effective
between plants growing in areas of high conspecific density, reducing
plant survival. The Janzen-Connell hypothesis suggests that this negative
density dependence (NDD) will promote plant community diversity
by preventing dominant species from competitively excluding other
species. This hypothesis is one of the most widely invoked explanations
for species coexistence, and ultimately high diversity, in plant communities.
Although numerous studies have revealed NDD in plant communities,
there is considerably less empirical support for the contention that this
will translate into enhanced community diversity. A key study in
Panama documented NDD at the seed-to-seedling transition in a suite
of 53 species, and linked this NDD to increased community diversity.
However, the causes of this NDD were not identified. Although reduced
herbivory by vertebrates can alter the composition of tropical plant
communities, such effects rarely show NDD, and insect herbivores
and pathogens are widely regarded as the most likely causes of NDD
leading to enhanced plant diversity. Despite this, studies demonstrating
a causal link between insect- and pathogen-mediated NDD and
plant community diversity are lacking.

We compared community diversity of seeds and recruiting seedlings
in a tropical forest in Belize, Central America, and investigated whether
experimentally excluding natural enemies decreased plant diversity,
as predicted by the Janzen-Connell hypothesis. The effective number
of species (inverse Simpson’s dominance index, 1/D) among seedlings
recruiting in unmanipulated (control) plots was significantly higher
than among seeds falling in adjacent seedfall traps (Δlog(1/D) = 0.69
± 0.058 (s.e.m.), t107 = 11.91, P < 0.001), corresponding to a doubling
of the effective number of plant species at the seed-to-seedling transition.
To determine whether insect herbivores or pathogens could be
contributing to this increase in diversity we compared the diversity of
seedlings growing in control plots (sprayed weekly with water) to plots
where we suppressed either insects by spraying an insecticide (Engeo),
or pathogens by spraying one of two fungicides, Amistar or Ridomil.
Each of the pesticide treatments reduced species diversity, but the effects
were only statistically significant for the fungicide Amistar (Fig. 1a;
t105 = −2.45, P = 0.016), which reduced the effective number of species
by approximately 16%. This result clearly implicates pathogenic
fungi in promoting seedling diversity.
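The link between the reported log difference and a "doubling" of effective species number can be made explicit with a small sketch. It assumes natural logarithms and that the t statistic is the estimate divided by its standard error; the helper function and the example counts are hypothetical and are not taken from the study's data.

```python
import math


def inverse_simpson(counts):
    """Effective number of species, 1/D, where D = sum of squared proportions."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    dominance = sum((c / total) ** 2 for c in counts)
    return 1.0 / dominance


# Four equally abundant species give an effective number of exactly 4,
# whereas one dominant species pulls the value towards 1.
even = inverse_simpson([10, 10, 10, 10])
skewed = inverse_simpson([97, 1, 1, 1])

delta_log = 0.69   # reported difference in log(1/D), seedlings minus seeds
sem = 0.058        # reported standard error of that difference
ratio = math.exp(delta_log)   # fold change in effective species number
t_stat = delta_log / sem      # close to the reported t of 11.91
```

Under these assumptions the ratio comes out just under 2, matching the stated doubling, and the small gap between the computed and reported t values is consistent with rounding of the published estimate and standard error.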

Two other changes in the plant community at the seed-to-seedling
transition were evident: a shift in species abundances, and altered species
composition. These trends were also affected by pesticide treatments.
Insecticide treatment increased the total number of recruiting seedlings
by a factor of 2.7 compared to the control (Fig. 1b; t195 = −7.67,
P < 0.001), demonstrating that plant-feeding insects are a major cause
of mortality at this life stage. Although Amistar enhanced seedling
recruitment, this effect was marginally nonsignificant (t195 = 1.81, P = 0.074).
Dissimilarity in species composition between the seeds and seedlings,
measured using the abundance-weighted Morisita-Horn index,
was approximately 87% in the control plots (Fig. 1c). Treating seedlings
with insecticide dramatically and significantly reduced this dissimilarity
(t105 = −7.86, P < 0.001). The fungicides (Amistar and Ridomil) did
not reduce the dissimilarity to seeds significantly, but nevertheless the
dissimilarity between the species compositions of the fungicide-treated
plots and the control plots was about 20% (Extended Data Fig. 1).
Overall, our results suggest that insects disproportionately kill certain plant
species, reducing their abundances during the transition from seeds to
seedlings. Insects thus strongly influence the structure of plant communities
in this forest; however, by doing so relatively independently of plant
density, their net effect on plant species diversity is small.

Figure 1 | Suppression of insects and pathogens alters seedling community
composition and diversity, respectively. a–c, Effects of insecticide (Engeo)
and two fungicides (Amistar and Ridomil) on the mean effective number of
species recruiting as seedlings (a); the mean seedling abundance summed

For 18 species, sufficient data were available to conduct a formal test 
for NDD (see Methods). The slope of the relationship between the log 
number of seeds and the log number of seedlings in the control plots 
was less than 1 in 13 of the 18 species, and significantly less than 1 for 3 
of these (Fig. 2a and Supplementary Table 1), indicating NDD. Furthermore,
the mean slope across species was significantly less than 1 in the
control treatment (t48 = -2.68, P = 0.010), suggesting that NDD is widespread,
as has been found in previous studies of the seed-to-seedling
transition in tropical forests. Suppressing fungi using the fungicide
Amistar reduced the strength of NDD so that the mean slope was
no longer significantly different from 1 (Fig. 2b; t48 = -1.54, P = 0.130).
Neither Ridomil nor Engeo reduced the strength of NDD, with the mean 
slope remaining significantly less than 1 in both treatments (Fig. 2b). 
Thus, the significant effects of fungal pathogen exclusion on seedling 
diversity shown in Fig. 1 can be causally linked to a reduction in the 
magnitude of NDD. 

The strength of NDD in control plots was greatest in the species that 
were most abundant as seeds (Fig. 3a; t16 = -4.33, P = 0.001; Extended
Data Table 1). Greater NDD might be detected in these species because 
their high densities facilitate transmission of insects and pathogens, or 
because pests and diseases adapt to exploit the most abundant resources. 
The positive relationship between NDD and seed abundances contrasts 
with the results of two recent studies investigating NDD in relation to 
adult abundances. These differing results may arise because the rank
abundances of species can shift substantially between seeds and adults, 
and because seed abundance reflects fecundity (which will be inversely 
correlated with seed size) as well as adult abundance. A third possibility 

Figure 2 | Recruitment across the seed-to-seedling transition showed NDD
in the control, but spraying with the fungicide Amistar removed this NDD.
a, The relationship between number of recruits and number of seeds for one
example species, Terminalia amazonia. Without NDD, the expected slope is 1
on a log-log scale (dotted line) and the y intercept represents recruitment at low

across all species (b); and the mean abundance-weighted Morisita-Horn
dissimilarity in species composition for seedlings under each treatment 
compared to seeds in adjacent seed traps (c). The error bars represent the 95% 
confidence intervals of the mean across the 36 stations. 


is that abundant seed production is correlated with other traits (for 
example, lower defence investment or shade tolerance), which are associated
in turn with greater susceptibility to density-responsive natural
enemies. Regardless of the cause, by reducing the survival of common 
species disproportionately, NDD may have increased the diversity of 
recruits more than expected from the average NDD effect. All pesticide 
treatments weakened the relationship between NDD and abundance 
markedly (Fig. 3b—d and Extended Data Table 1). By weakening NDD, 
especially in species that are abundant as seeds, fungicide application 
may have removed one mechanism for enhancing diversity at the seed- 
to-seedling transition, leading to the significantly lower seedling diver- 
sity observed in the Amistar treatment. 

As a final evaluation of the contribution of NDD to enhancing the 
diversity of recruiting seedlings, we used a simulation approach. Changes 
in community diversity and composition across the seed-to-seedling 
transition and following the exclusion of natural enemies could result 
from either NDD or trade-offs between seed production and allocation 
to defence against insects and pathogens. To distinguish these two 
possibilities we used models fitted to the 18 most abundant species 
to simulate communities in two scenarios (see Methods). In the first 
scenario (low-density survival), per-capita recruitment was independ- 
ent of seed density and equal to that expected under each treatment in 
the absence of conspecific neighbours. In the second scenario (NDD 
survival), per-capita recruitment was dependent on both pesticide 
treatment and seed density. Simulations in the low-density survival 
scenario greatly underestimated the effective number of species in the 
control plots (Fig. 4a). Total seedling abundance was also over- 
estimated (Fig. 4b) and dissimilarity in species composition under- 
estimated (Fig. 4c) in the control and insecticide treatments in the 

density. The observed slope was lowest (and <1) in the control; treatment with
fungicides but not insecticides increased the slope. b, The NDD effect is 
significantly <1 across 18 species in the control treatment, indicating prevalent 
NDD. Spraying with Amistar, but not other pesticides, removed this effect. 
Error bars show 95% confidence intervals of the mean. 

Figure 3 | Negative density dependence is strongest in species that are most
abundant as seeds. The relationship between seed abundance and the strength
of NDD is shown for the 18 species analysed. a-d, The relationship for the
control plots sprayed with water (a), plots sprayed with the insecticide
Engeo (b), plots sprayed with the fungicide Amistar (c), and plots sprayed with an
alternative fungicide, Ridomil (d). The bold lines are the relationships fitted with 
a weighted linear mixed-effects model (weights are the inverse of the standard 
deviations, which are indicated by error bars), with the 95% confidence intervals 
of the mean. The dotted line shows the null expectation of regression coefficients 
of 1 in the absence of an effect of seed abundance on the strength of NDD. 


absence of NDD. Similar results were obtained using an alternative 
simulation scenario, where per-capita recruitment reflected that 
recorded at the mean seed density for each species (Extended Data
Fig. 2). Adding NDD to the simulations replicated the observed data 
better. Overall, these simulations confirm that pathogen-mediated 
NDD is responsible for increasing the diversity of seedlings. 
Although individual components of the Janzen-Connell hypothesis
have been tested repeatedly since its formulation more than 40 years
ago, experimental tests of the key overall hypothesis that natural enemies
cause NDD and thus promote species coexistence and enhance
species diversity are rare. One such study found no evidence that excluding
vertebrate herbivores reduced NDD or diversity. However, although

Figure 4 | Including NDD in model simulations reproduces the observed
diversity patterns, whereas excluding NDD underestimates diversity in the 
control and insecticide treatments. Observed diversity (a), total abundance 
(b) and dissimilarity in species composition to the seeds (c) in each treatment 
were compared with those simulated either assuming a constant survival for 

vertebrates have occasionally been implicated as drivers of NDD,
the primary causes of NDD are thought to be insects and pathogens.
The results presented here build on existing evidence for widespread
NDD in tropical plant communities by establishing the cause of
NDD, and by linking it to increased plant species diversity, as suggested
by the Janzen-Connell hypothesis.

Our experiments highlight that both insect herbivores and pathogens 
help structure tropical plant communities at the early stages of com- 
munity assembly and provide support for a pivotal role for natural- 
enemy-mediated NDD in maintaining species diversity in this tropical 
forest. Although the magnitude of the NDD we observed was relatively 
small, this study was conducted over a relatively short timescale (17 
months) in a tropical forest of relatively low plant species diversity 
(approximately 320 tree species have been recorded in the reserve).
The effects of NDD will probably accumulate over time, and may be 
stronger in more species-rich forests. Indeed, similar experiments in 
other forests are now needed to evaluate the generality of the Janzen- 
Connell hypothesis as an explanation for variation in species diversity 
among tropical plant communities. 


METHODS SUMMARY 


We established 36 sampling stations within a 1-hectare (ha) area in the Chiquibul 
Forest Reserve, Belize. Each station had three seed traps and four seedling plots. Plots 
were randomly assigned to four treatments: control (sprayed with water), insecticide 
(Engeo), or one of two fungicides (Amistar or Ridomil), applied weekly for 17 months. 
We recorded numbers of seeds from each species collected weekly in each trap. 
Number and identities of seedlings germinating in each plot were recorded monthly 
during the peak recruitment period (April to August) and every 2 to 4 months other- 
wise. We compared the total number of individuals and their diversity (inverse 
Simpson’s dominance index, 1/D) between seeds and seedlings in the control plot 
and among pesticide treatments using mixed-effects models. We also compared 
dissimilarity in species composition between seeds and seedlings (abundance-weighted
Morisita-Horn dissimilarity index) among pesticide treatments. In the
absence of NDD, a slope of 1 is expected for the relationship between log number
of seeds and log number of seedlings. We estimated this slope and the effect of
pesticide treatments for 18 species. To determine the average effect of density and 
the effect of overall species abundance, we modelled the slopes (see Methods) of 
each species and pesticide treatment combination as a function of log total seed 
abundance and pesticide treatment, using a weighted mixed-effects model. Finally, 
to determine whether NDD could generate observed differences in communities 
among treatments, we used the models of the 18 species to simulate communities, 
assuming survival to be either density dependent or density independent, based on 
the establishment probability expected in the absence of conspecific neighbours. 
We calculated abundance, diversity and dissimilarity based on 1,000 simulations 
for each scenario and compared the mean and 95% confidence intervals of the 
observed data to those derived from the simulations. Data (Supplementary Data 1) 
and code for analyses (Supplementary Notes 1) are provided as Supplementary 
Information. 
METHODS

Field survey. Our field site was close to the Las Cuevas Research Station in south- 
west Belize (16° 43’ 53’’ N, 88° 59’ 11’’ W) at 450 m elevation within the 170,000-ha 
Chiquibul Forest Reserve protected area. This site has limestone geology and a 
relatively intact flora and vertebrate fauna. It experiences a marked dry season, 
typically from February to May, with annual rainfall approximately 1,500 to 1,800 
mm (ref. 30). We established 36 sampling stations on the forest floor, positioned at
20-m intervals on a 120 m × 120 m grid. Each station comprised seven 1-m² quadrats,
placed as close together as possible while avoiding trees and large rocks. Three
of the quadrats at each station were randomly selected as locations for 1-m² seed
traps made from 1-mm mesh fibreglass netting, suspended 80 cm above the ground
using PVC poles. The remaining quadrats were cleared of existing seedlings and 
assigned at random to one of three enemy exclusion treatments or to a control 
treatment. Two fungicide treatments were used: Amistar (Syngenta Ltd; active 
ingredient, azoxystrobin), which has broad-spectrum systemic activity against a 
range of plant pathogenic fungi, and Ridomil Gold MZ 68WP (Syngenta Ltd; active
ingredients, mancozeb and metalaxyl), which protects plants from infection by oomycetes
and fungi. The insecticide used was Engeo (Syngenta Ltd; active ingredient,
thiamethoxam), which provides both systemic and contact protection against a 
range of insects. Pesticides were applied weekly with a hand mister, following the 
manufacturers’ guidelines (0.005 g of Amistar, 0.25 g of Ridomil Gold or 0.0025 ml 
of Engeo, in each case dissolved in 50 ml of water). Control plots were sprayed with 
50 ml of water at the same time as pesticide applications. Treatments began in July 
2007 except for the Engeo application, which began in April 2008. All treatments 
were applied weekly until September 2009. Only data from April 2008 onwards 
(during which all treatments were applied) were used in the analyses presented here. 

Seeds were collected weekly from the traps; damaged or inviable seeds (partially 
eaten or immature) were discarded and the remaining seeds were counted and 
identified to species level, where possible, or as morphospecies. A subset of the seeds 
from each species and morphospecies were placed on moist tissue paper in seed 
germination trays. We photographed examples of all seed and seedling morpho- 
species to match seeds to seedlings in cases where species identification was not 
possible. This ensured consistent classification throughout the experiment, and 
facilitated subsequent plant identification. In this way we matched 97% of the 
individual seeds collected in our study to seedlings. 

We censused the seedling plots for new seedlings every month during the peak 
period of fruiting and recruitment (April to August) and less frequently (every 2 to 
4 months) during the rest of the year. At each census, all new seedlings were tagged 
and identified with species or morphospecies. Unidentified seedlings were photo- 
graphed. By comparing these photographs to seedlings germinated from collected 
seeds we were able to match 90% of the observed seedlings to seeds. 

To confirm that the significant effects of insecticide treatment were a consequence 
of reduced attack from insects rather than a direct effect of Engeo on plant survival,
we set up experiments in May 2010 in which a subset of the focal plant species 
(Stemmadenia donnell-smithii (n = 60), Cordia alliodora (n = 80), Cryosophila 
stauracantha (n = 70), Combretum laxum/fruticosum (n = 70), Terminalia ama- 
zonia (n = 100) and Forsteronia sp. (n = 70)) were grown from seed at high density 
under insect-free conditions, and with Engeo treatment. Freshly collected seeds 
were sown into 60 seed trays (36 cm length × 24 cm width × 5 cm depth) filled with
locally collected soil that had been sorted to remove large stones and roots. Each tray 
was divided into six sections, with seeds of each species sown into one section. 
Trays were enclosed in a bag made from insect-proof nylon netting to exclude 
insects. The netting was raised above the surface of the tray to allow seedlings to 
grow. For the shadehouse experiment, 30 trays were placed in randomly allocated 
positions on raised benches in a small forest gap, covered with waterproof shade 
netting. For the field experiment 30 trays were placed on the forest floor in a 
randomized grid design, spaced by 200 cm. Trays in the shadehouse were watered 
regularly (approximately every 2 to 3 days). Half of the trays (chosen at random) 
in both experiments were sprayed weekly with 0.0025 ml m⁻² of Engeo using a
hand mister. The remaining trays were sprayed with an equal volume of water. 
Germinating and surviving seedlings were censused after 8 weeks (shadehouse 
experiment) or 7 weeks (field experiment). We analysed the number of seedlings 
at the end of the experiment as a function of insecticide treatment using general- 
ized linear models for each species, assuming a negative binomial distribution for 
the errors. No significant (P < 0.05) effects of Engeo on survivorship were docu- 
mented in any species in either experiment (see Extended Data Table 2). 
Analysis. We calculated the total number of seeds or seedlings observed in each
seedling plot (N) and the reciprocal of the Simpson's dominance index (1/D,
$D = \sum_{k=1}^{s} p_k^2$, where $p_k$ is the proportional abundance of species k in a community
with s species) as a measure of the effective number of species. We quantified
differences in species composition among treatments by calculating the Morisita-Horn
index of dissimilarity ($R_{ij}$) between the seed traps and all the seedling plots, i, at
each station, j. We calculated
the pairwise dissimilarity between each plot and each trap, and then took the
mean for the three traps at each station. Results were qualitatively unchanged using
other diversity and dissimilarity metrics (for example, Shannon's diversity
index and the Bray-Curtis index of dissimilarity; see Extended Data Tables 3 and
4). All diversity and dissimilarity indices were calculated using the 'vegan' package (ref. 33)
in R v3.0.1 (ref. 34). We compared these metrics between control plots and seed traps
and among pesticide treatments (control, insect exclusion with Engeo, true fungi
exclusion with Amistar or oomycete and true fungi exclusion with Ridomil), using
linear mixed-effects models (fitted using the 'nlme' package in R v3.0.1 (ref. 34))
with different intercepts for the stations included as a normally distributed random
effect. We assumed a Gaussian error distribution for the models of N (log-transformed),
1/D (log-transformed) and dissimilarity (logit-transformed). There was
evidence of heteroscedasticity in the residuals of the models of 1/D and $R_{ij}$, so this
was accounted for by explicitly modelling the variance as a function of pesticide
treatment (for 1/D) or as an exponential function of the expected values (for $R_{ij}$).
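
The two community metrics described above can be illustrated with a minimal Python sketch. This is not the authors' code (the analyses used the 'vegan' package in R); the function names and the example counts are hypothetical, and the Morisita-Horn expression used here is the standard abundance-weighted form, assumed to match the index the paper calls $R_{ij}$.

```python
import numpy as np

def inverse_simpson(counts):
    """Effective number of species, 1/D, with D = sum_k p_k^2."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()          # proportional abundances p_k
    return 1.0 / np.sum(p ** 2)

def morisita_horn_dissimilarity(x, y):
    """Abundance-weighted Morisita-Horn dissimilarity between two count vectors.

    Standard form (an assumption about the paper's exact notation):
    1 - 2 * sum(px * py) / (sum(px^2) + sum(py^2)).
    """
    px = np.asarray(x, dtype=float) / np.sum(x)
    py = np.asarray(y, dtype=float) / np.sum(y)
    similarity = 2.0 * np.sum(px * py) / (np.sum(px ** 2) + np.sum(py ** 2))
    return 1.0 - similarity

# Hypothetical counts of four species in one seed trap and one seedling plot.
seeds = [80, 10, 5, 5]
seedlings = [10, 10, 5, 5]

print("1/D seeds:", inverse_simpson(seeds))
print("1/D seedlings:", inverse_simpson(seedlings))
print("Morisita-Horn dissimilarity:", morisita_horn_dissimilarity(seeds, seedlings))
```

In this toy example the seedling community is more even than the seed community, so its effective number of species is higher even though no species has been added.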

We used these models to test three hypotheses: first, diversity is greater among 
seedlings than among seeds; second, excluding natural enemies with pesticides 
decreases diversity; and, third, excluding natural enemies with pesticides alters spe- 
cies composition. 

For a subset of species we examined the effects of pesticides, seed density and
their interaction on seedling recruitment at the level of individual species. For this
analysis, we selected all 18 species that met two criteria: seeds and/or seedlings of
these species were recorded at ≥5 stations (the sets of stations with seeds and
seedlings did not have to overlap); and mean seed density varied at least threefold
among stations. The species that met these criteria are listed in Supplementary
Table 1. The relationship between the number of seeds in plot i at station j, $N_{0,ij}$,
and the expected number of recruits, $N_{1,ij}$, can be described by the equation

$$N_{1,ij} = \exp(\alpha)\, N_{0,ij}^{\beta} \qquad (2)$$

where $\exp(\alpha)$ is the ratio of seedlings to seeds at low density ($N_{0,ij} = 1$). The parameter
$\beta$ is 1 if survival is independent of conspecific density and less than 1 when
this ratio is reduced at high density (that is, NDD). Because we did not measure the
seed rain in the seedling plots at each station j directly, $N_{0,ij}$ has to be estimated
from the adjacent seed traps at station j instead. Ignoring the error in these estimates
of $N_{0,ij}$ biases the estimation of $\beta$ towards 0 (ref. 36), and therefore overestimates
the importance of NDD. To overcome this potential bias, we modelled
$N_0$ and $N_1$ jointly as:

$$N_{0,ij} \sim \mathrm{NegBin}(\lambda_j, \kappa_0); \quad \lambda_j \sim \mathrm{lognorm}(\lambda, \sigma^2)$$

$$N_{1,ij} \sim \mathrm{NegBin}\left(\exp(\alpha_L)\, \lambda_j^{\beta_L},\ \kappa_{1,L}\right) \qquad (3)$$

where both $N_{0,ij}$ (the number of seeds in plot i at station j) and $N_{1,ij}$ (the number of
recruits in plot i at station j) were drawn from negative binomial distributions
defined by the expected number of individuals and stage (t = 0, 1) and treatment
(L) specific size or overdispersion parameters, $\kappa_{t,L}$. The expected number of seeds
in the plots at station j is $\lambda_j$ and the $\lambda_j$ were drawn from a lognormal distribution
with mean $\lambda$ and variance $\sigma^2$. The number of seeds falling in the seedling plots is
treated as missing data which need to be imputed from the seed trap data collected
at the same station. The parameters $\alpha_L$ and $\beta_L$ correspond to the low-density survival
rate and effect of density on recruitment under treatment L as described by
equation (2). This hierarchical model was fitted using the INLA package in R
v3.0.1 (ref. 34).
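
Taking logarithms of equation (2) gives log N1 = α + β log N0, so β is the slope of recruits against seeds on a log-log scale. The sketch below illustrates this with ordinary least squares in Python. It is deliberately simpler than the hierarchical INLA model described above: it is not the authors' method, it ignores the error in seed counts that the joint model (3) is designed to handle, and the function name and all numbers are hypothetical.

```python
import numpy as np

def loglog_slope(seeds, recruits):
    """Least-squares estimates of (alpha, beta) in log N1 = alpha + beta * log N0."""
    log_seeds = np.log(np.asarray(seeds, dtype=float))
    log_recruits = np.log(np.asarray(recruits, dtype=float))
    beta, alpha = np.polyfit(log_seeds, log_recruits, 1)  # slope first, then intercept
    return alpha, beta

# Hypothetical, noise-free example: alpha = -1.0 and beta = 0.7 (beta < 1, that is, NDD).
seeds = np.array([1.0, 5.0, 20.0, 100.0, 400.0])
recruits = np.exp(-1.0) * seeds ** 0.7

alpha_hat, beta_hat = loglog_slope(seeds, recruits)
per_capita = recruits / seeds  # per-capita recruitment falls as seed density rises

print("alpha:", alpha_hat, "beta:", beta_hat)
print("per-capita recruitment:", per_capita)
```

With β below 1, the per-capita recruitment printed on the last line declines steadily with seed density, which is the signature of NDD that the slope test looks for.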

We used estimates of $\beta_L$ from these models to test two hypotheses for each
species: first, survival is negatively density dependent ($\beta_{control} < 1$); second, natural
enemies cause the observed negative density dependence, so that applying pesticides
weakens the relationship between seed density and survival ($\beta_{control} < \beta_{pesticide}$).

We tested whether the estimates of $\beta$ across the 18 species were significantly
different from 1 in the control treatment and whether they varied among pesticide
treatments and with the logarithm of the seed abundance ($N_{0,k}$) of each species, k.
This was achieved by fitting a linear mixed-effects model to estimates of $\beta$ with
species included as a random effect. The contribution of each estimate of $\beta$ to the
model was weighted by the inverse of its standard deviation. The model can
be described as

$$b_k \sim \mathrm{Norm}(0, \sigma_b^2); \quad \varepsilon_{kL} \sim \mathrm{Norm}(0, \sigma_\varepsilon^2) \qquad (4)$$

where the $\gamma$ represent the estimated fixed effects parameters, $b_k$ is the random
effect for species k and $\varepsilon_{kL}$ is the error under treatment L for species k. The
parameters $\gamma_{1,L}$ (pesticide effect on mean NDD) and $\gamma_{3,L}$ (pesticide effect on the
relationship between overall seed abundance and strength of NDD) are zero for
the control treatment and represent the change in these parameters under each
pesticide treatment compared to the control.
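
A form of the fixed-effects part of this model that is consistent with the parameter descriptions above and with the terms listed in Extended Data Table 1 can be sketched as follows. The intercept symbol $\gamma_0$ and the exact arrangement of terms are assumptions made for illustration, not a transcription of the published equation.

```latex
\beta_{kL} = \gamma_0 + \gamma_{1,L}
           + \left(\gamma_2 + \gamma_{3,L}\right) \log N_{0,k}
           + b_k + \varepsilon_{kL},
\qquad \gamma_{1,\mathrm{control}} = \gamma_{3,\mathrm{control}} = 0
```

Under this reading, $\gamma_2$ is the slope of NDD strength on (standardised) log seed abundance in the control, and $\gamma_2 + \gamma_{3,L}$ is the corresponding slope under pesticide treatment L.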

Finally, we tested whether the estimated effects of NDD and pesticides on recruitment
of individual species could account for the observed differences in diversity
among pesticide treatments. We used the parameters estimated in the 18 species-specific
models to simulate new communities in three scenarios. For each species,
the number of seeds in each trap was drawn from a negative binomial distribution
with mean $\lambda_j$ and size $\kappa_0$. In the 'NDD survival' scenario, the number of seedlings
in plots at station j with treatment L was drawn from a negative binomial distribution
with mean = $\exp(\alpha_L)\lambda_j^{\beta_L}$ and size = $\kappa_{1,L}$. The 'low-density survival' scenario
assumed that survival was independent of seed density and was equal to the
survival for each species within each pesticide treatment when density was 1 (that
is, when seedlings had no conspecific neighbours). The number of seedlings was
therefore drawn from a negative binomial distribution with mean = $\exp(\alpha_L)\lambda_j$.
We then calculated the effective number of species for the simulated communities
at each station and treatment combination and extracted the means for each treatment.
This procedure was repeated 1,000 times and the median and 95% quantiles
across the simulations were extracted in each scenario. A similar procedure was used
to simulate communities expected in a third scenario, consistent with previous
studies, where seed-to-seedling transition probabilities reflected those recorded at
the mean seed density for each species. This was achieved by refitting the model to
each species after fixing the values of all the $\beta_L$ to 1 and using this model for the
simulations. We then calculated the mean total abundance, effective number of
species and dissimilarity to species composition of the seeds in each treatment using
the observed data for the 18 species. We compared these observed data to the
simulations in the low-density and NDD scenarios (main text) and the mean-density
and NDD scenarios (Extended Data Fig. 2).
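
The contrast between the 'low-density survival' and 'NDD survival' scenarios can be sketched in Python as below. This is a simplified illustration, not the authors' simulation code: it uses a single station and treatment, four hypothetical species, and made-up values for λ, α, β and the size parameter. The only thing taken from the text is the structure of the two means, exp(α)λ versus exp(α)λ^β, and the negative binomial sampling.

```python
import numpy as np

def draw_neg_binomial(rng, mean, size):
    """Negative binomial draws parameterized by mean and size (dispersion)."""
    mean = np.asarray(mean, dtype=float)
    prob = size / (size + mean)  # convert to numpy's (n, p) parameterization
    return rng.negative_binomial(size, prob)

def inverse_simpson(counts):
    """Effective number of species, 1/D; NaN if no seedlings recruited."""
    total = counts.sum()
    if total == 0:
        return np.nan
    p = counts / total
    return 1.0 / np.sum(p ** 2)

def simulate_seedlings(rng, lam, alpha, beta, size, ndd):
    """Seedling counts per species at one station under one scenario."""
    exponent = beta if ndd else np.ones_like(beta)  # beta = 1 removes density dependence
    mean = np.exp(alpha) * lam ** exponent
    return draw_neg_binomial(rng, mean, size)

def run_scenarios(n_sims=1000, seed=0):
    """Simulated 1/D for both scenarios; all parameter values are hypothetical."""
    rng = np.random.default_rng(seed)
    lam = np.array([1000.0, 100.0, 10.0, 10.0])  # expected seeds per species
    alpha = np.full(4, -2.0)                     # log low-density survival
    beta = np.array([0.6, 0.9, 1.0, 1.0])        # stronger NDD in abundant species
    size = 5.0                                   # overdispersion parameter
    results = {}
    for name, ndd in (("low_density", False), ("ndd", True)):
        results[name] = np.array([
            inverse_simpson(simulate_seedlings(rng, lam, alpha, beta, size, ndd))
            for _ in range(n_sims)
        ])
    return results

results = run_scenarios()
for name, diversity in results.items():
    median = np.nanmedian(diversity)
    lower, upper = np.nanpercentile(diversity, [2.5, 97.5])
    print(f"{name}: median 1/D = {median:.2f} (95% quantiles {lower:.2f} to {upper:.2f})")
```

Because the hypothetical β values are lowest for the most abundant species, switching NDD on evens out the expected seedling abundances and raises the simulated effective number of species, which mirrors the direction of the effect reported for the control plots.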

Extended Data Figure 1 | The mean abundance-weighted Morisita-Horn
dissimilarity in species composition (and 95% confidence intervals), 
comparing seedlings recruiting in the control plots with seedlings in the 
pesticide treatments and with seeds falling into seed traps. 

Extended Data Figure 2 | A comparison of the observed seedling
communities (observed survival) with those simulated either fixing survival
to the mean for each species in each treatment (mean density survival) or
allowing survival to be negatively density dependent (NDD survival).
The simulated values are means and 95% confidence intervals based on 1,000
simulations for effective number of species, total abundance and community
dissimilarity to seeds falling in adjacent traps.


©2014 Macmillan Publishers Limited. All rights reserved 


Extended Data Table 1 | Coefficients from the model relating the
strength of NDD to treatment, log total seed abundance, and their
interaction

Term                                          Parameter  Std. Error  d.f.  t      P
Intercept (Water)                             0.12       0.044       48    -      -
Insecticide effect (Engeo) (γ1,Engeo)         0.00       0.047       48    -      -
Fungicide effect (Amistar) (γ1,Amistar)       0.05       0.048       48    1.02   0.312
Fungicide effect (Ridomil) (γ1,Ridomil)       -0.02      0.049       48    -0.34  0.732
log N0 (standardised) (γ2)                    0.37       0.086       16    4.33   0.001
Engeo:log N0 (standardised) (γ3,Engeo)        0.23       0.085       48    2.65   0.010
Amistar:log N0 (standardised) (γ3,Amistar)    0.25       0.096       48    2.60   0.012
Ridomil:log N0 (standardised) (γ3,Ridomil)    0.28       0.095       48    3.00   0.004
Among species variance (σ_b²)                 0.012
Residual variance (σ_ε²)                      0.469

The strength of NDD was measured as the coefficient for seed density, β, from equation (3). The log total
seed abundance was standardized across species. The model was fitted as a mixed-effects model with
the contribution of each value of β weighted by the inverse of its standard deviation.

Extended Data Table 2 | Coefficients from the negative binomial model fitted to the shadehouse and field trials of effects of the insecticide
Engeo on seedling survival

                          Shadehouse                            Field
                          Coefficient  SE     Z        P        Coefficient  SE     Z        P
Stemmadenia intercept     0.57         0.044  -12.858  <0.001   0.67         0.128  6.220    <0.001
  EngeoTreatment          0.04         0.062  0.621    0.535    0.05         0.181  -0.282   0.778
Cordia intercept          0.43         0.086  6.053    <0.001   0.35         0.064  -5.495   <0.001
  EngeoTreatment          -0.07        0.121  -0.592   0.554    -0.06        0.090  -0.633   0.527
Cryosophila intercept     -0.33        0.036  8.941    <0.001   -0.61        0.042  -14.366  <0.001
  EngeoTreatment          -0.01        0.051  -0.154   0.877    -0.03        0.060  -0.531   0.595
Combretum intercept       0.67         0.043  -15.452  <0.001   3.11         0.332  9.364    <0.001
  EngeoTreatment          0.02         0.061  0.243    0.808    0.83         0.456  1.824    0.068
Terminalia intercept      4.83         0.289  -16.725  <0.001   6.62         0.743  8.908    <0.001
  EngeoTreatment          0.09         0.417  -0.208   0.835    1.10         0.878  1.251    0.211
Forsteronia intercept     -0.67        0.043  -15.567  <0.001   -1.12        0.108  -10.360  <0.001
  EngeoTreatment          0.01         0.061  0.152    0.879    0.00         0.152  0.000    1.000

Note that Engeo did not have a significant effect on survival in any of the species tested (shaded rows).

Extended Data Table 3 | Tests of pesticide effects on seedling species
diversity using different diversity indices

Treatment              Parameter  Std. Error  d.f.  t      P

Log Inverse Simpson's Diversity Index (1/D)
Intercept (Water)      1.30       0.059       105   22.04  0.000
Insecticide (Engeo)    -0.06      0.077       105   -0.83  0.410
Fungicide (Amistar)    -0.18      0.072       105   -2.45  0.016
Fungicide (Ridomil)    -0.08      0.072       105   -1.10  0.274

Log Shannon's Diversity Index (H)
Intercept (Water)      1.56       0.053       105   29.30  0.000
Insecticide (Engeo)    0.03       0.063       105   0.54   0.591
Fungicide (Amistar)    -0.15      0.067       105   -2.29  0.024
Fungicide (Ridomil)    -0.07      0.067       105   -1.05  0.298

Log Fisher's alpha (α)
Intercept (Water)      1.33       0.084       104   15.92  0.000
Insecticide (Engeo)    -0.13      0.086       104   -1.57  0.120
Fungicide (Amistar)    -0.18      0.095       104   -1.88  0.063
Fungicide (Ridomil)    -0.04      0.126       104   -0.33  0.744

Rarefied Species Richness
Intercept (Water)      1.16       0.033       105   35.38  0.000
Insecticide (Engeo)    -0.06      0.037       105   -1.56  0.122
Fungicide (Amistar)    -0.11      0.043       105   -2.55  0.012
Fungicide (Ridomil)    -0.03      0.041       105   -0.77  0.446

The indices were calculated using the vegan v2.0-8 package (ref. 33) in R v3.0.1 (ref. 34). Rarefaction was based on
samples of five individuals. The table presents parameters of linear mixed-effects models fitted to these
metrics as functions of treatment. The first parameter (intercept) represents mean diversity in the
control treatment and the following parameters represent the difference between the control treatment
and the three pesticide treatments.

Extended Data Table 4 | Tests of pesticide effects on dissimilarity in
species composition, comparing assemblages of seedlings germinating
in plots to those of seeds falling in adjacent seed traps

Treatment              Parameter  Std. Error  d.f.  t        P

Logit Morisita-Horn Dissimilarity Index
Intercept (Water)      1.72       0.240       105   7.18     0.000
Insecticide (Engeo)    -1.91      0.243       105   -7.86    0.000
Fungicide (Amistar)    -0.03      0.270       105   -0.12    0.904
Fungicide (Ridomil)    0.12       0.272       105   0.44     0.664

Binomial Dissimilarity Index
Intercept (Water)      8.251      0.205       105   40.208   0.000
Insecticide (Engeo)    -0.965     0.183       105   -5.278   0.000
Fungicide (Amistar)    0.070      0.205       105   0.342    0.733
Fungicide (Ridomil)    0.013      0.203       105   0.063    0.950

Logit Bray-Curtis Dissimilarity Index
Intercept (Water)      3.123      0.175       105   17.896   0.000
Insecticide (Engeo)    -1.412     0.110       105   -12.889  0.000
Fungicide (Amistar)    -0.144     0.110       105   -1.312   0.192
Fungicide (Ridomil)    -0.008     0.110       105   -0.075   0.940

Logit Jaccard Dissimilarity Index
Intercept (Water)      3.814      0.175       105   21.757   0.000
Insecticide (Engeo)    -1.417     0.110       105   -12.847  0.000
Fungicide (Amistar)    -0.148     0.110       105   -1.344   0.182
Fungicide (Ridomil)    -0.008     0.110       105   -0.075   0.940

Four alternative metrics of dissimilarity were calculated using the vegan v2.0-8 package in R v3.0.1
(ref. 33). The table presents the fixed-effects parameters of linear mixed-effects models used to
describe these metrics as functions of treatment. The intercept represents the mean dissimilarity
between seed and control plot assemblages, and the other three parameters indicate the difference
between seed-control plot dissimilarity and seed-pesticide plot dissimilarity.

Early flowering plants are thought to have been woody species
restricted to warm habitats’. This lineage has since radiated into 
almost every climate, with manifold growth forms*. As angiosperms 
spread and climate changed, they evolved mechanisms to cope with 
episodic freezing. To explore the evolution of traits underpinning 
the ability to persist in freezing conditions, we assembled a large 
species-level database of growth habit (woody or herbaceous; 49,064 
species), as well as leaf phenology (evergreen or deciduous), diameter 
of hydraulic conduits (that is, xylem vessels and tracheids) and climate 
occupancies (exposure to freezing). To model the evolution of spe- 
cies’ traits and climate occupancies, we combined these data with an 
unparalleled dated molecular phylogeny (32,223 species) for land 
plants. Here we show that woody clades successfully moved into freezing- 
prone environments by either possessing transport networks of small 
safe conduits’ and/or shutting down hydraulic function by dropping 
leaves during freezing. Herbaceous species largely avoided freezing 
periods by senescing cheaply constructed aboveground tissue. Growth 
habit has long been considered labile®, but we find that growth habit 
was less labile than climate occupancy. Additionally, freezing envir- 
onments were largely filled by lineages that had already become herbs 
or, when remaining woody, already had small conduits (that is, the 
trait evolved before the climate occupancy). By contrast, most decidu- 
ous woody lineages had an evolutionary shift to seasonally shedding 
their leaves only after exposure to freezing (that is, the climate occu- 
pancy evolved before the trait). For angiosperms to inhabit novel 
cold environments they had to gain new structural and functional 
trait solutions; our results suggest that many of these solutions were 
probably acquired before their foray into the cold. 

Flowering plants (angiosperms) today grow in a vast range of envir- 
onmental conditions, with this breadth probably related to their diverse 
morphology and physiology. However, early angiosperms are generally 
thought to have been woody and restricted to warm understory 
habitats. Debate continues about these assertions, in part because of 
the paucity of fossils and uncertainty in reconstructing habits for these 
first representatives. Nevertheless, greater mechanical strength of 
woody tissue would have made extended lifespans possible at a height 
necessary to compete for light. A major challenge resulting from 
increased stature is that hydraulic systems must deliver water at tension 
to greater heights: as path lengths increase so too does resistance. 
Among extant strategies, the most efficient method of water delivery 
is through large-diameter water-conducting conduits (that is, vessels 
and tracheids) within xylem. 

Early in their evolution, angiosperms probably evolved larger conduits 
for water transport, especially compared with their gymnosperm cousins. 
Although efficient in delivering water, these larger cells would have 
impeded angiosperm colonization of regions characterized by episodic 
freezing, as the propensity for freezing-induced embolisms (air bubbles 
produced during freeze/thaw events that block hydraulic pathways) 
increases as conduit diameter increases. Three evolutionary solutions 
seemingly arose to address the challenges of freezing: (1) woody species 
withstood freezing temperatures without serious loss of hydraulic function 
by building safe water-transport networks consisting of small-diameter 
conduits; (2) woody species shut down hydraulic function by becoming 
deciduous, dropping leaves during freezing periods; and (3) herbaceous 
species largely avoided freezing by senescing cheaply constructed 
aboveground tissue and overwintering, probably as seeds or underground 
storage organs. However, the order in which angiosperms are likely to 
have acquired these solutions relative to exposure to and persistence in 
the cold remains unclear. 

Figure 1 | Time-calibrated maximum-likelihood estimate of the molecular 
phylogeny for 31,749 species of seed plants. The four major angiosperm 
lineages discussed in the text are highlighted: Monocotyledoneae (green), 
Magnoliidae (blue), Superrosidae (brown) and Superasteridae (yellow). 
Non-seed plant outgroups (that is, bryophytes, lycophytes and monilophytes) 
were removed for the purposes of visualization. 

Proportions of herbaceous species, deciduous species and those with 
small water-conducting conduits increase towards the poles, and 
an earlier limited survey of angiosperm families indicated that 
herbaceousness and ability to cope with freezing evolved in parallel. However, 
exactly how global-scale ecological patterns are linked to functional evolu- 
tion of angiosperms is uncertain. We dissect the contributions of different 
evolutionary solutions allowing angiosperms to cope with periodic freez- 
ing and assess likely pathways by which clades acquired these traits (that is, 
timing of evolution in climate occupancy relative to trait evolution). 

We compiled a very large species-level database of angiosperm growth 
habits (49,064 species, which is 16.4% of accepted land plant species 
in The Plant List; http://www.theplantlist.org), leaf phenology, conduit 
diameter and freezing climate exposure. To trace species trait and climate 
occupancy relationships over evolutionary time, we generated an 
unparalleled time-scaled molecular phylogeny for 32,223 land plant species 
in our database (Fig. 1; http://www.onezoom.org/vascularplants_tank2013nature.htm). 
This timetree gives us the most comprehensive view 
yet into the evolutionary history of angiosperms. On the basis of their 
geographic distributions, we classified species’ climate occupancies with 
respect to freezing: ‘freezing unexposed’, only encountering temperatures 
>0°C across a species’ range; and ‘freezing exposed’, encountering 
temperatures ≤0°C somewhere across a species’ range. This dichotomy 
assumes that climate tracking through environmental changes is more 
common than the evolution of climate occupancy; this is more likely to 
be true if freezing exposure has a physiological cost in regions without 
freezing. Species were further distinguished by leaf phenology (deciduous 
or evergreen); conduit diameter (large ≥0.044 mm, or small <0.044 mm; 
as 0.044 mm diameter is the diameter above which freezing-induced 
embolisms are believed to become frequent at modest tensions); and 
growth form (woody or herbaceous, with woody species defined as 
those maintaining a prominent aboveground stem that is persistent 
over time and with changing environmental conditions; see Extended 
Data Fig. 1 for examples of angiosperms with woody growth habits as 
we define them, and Extended Data Table 1 for a breakdown of growth 
habit by order within angiosperms). 

Among woody species we asked whether evolutionary transitions 
between climate occupancy states were significantly associated with shifts 
in leaf phenology and/or conduit diameter. Among all angiosperms we 
asked whether evolutionary transitions between climate occupancy states 
were significantly associated with shifts in growth form. We determined 
the relative lability of climate occupancy (exposure to freezing) versus 
traits (growth form, leaf phenology or conduit diameter) by summing 
all climate occupancy transitions and dividing by the sum of all trait 
transitions. We also devised a novel summary based on these evolutionary 
transition rates that provides the likeliest pathways from the purported 
early angiosperm (woody, evergreen, with large conduits and 
freezing unexposed) to a plant with traits for freezing conditions. 
Because evolutionary rates are unlikely to be uniform at this phylogenetic 
scale, we ran growth form analyses both across the entire angiosperm 
data set and also within each of four major lineages: Monocotyledoneae 
(monocots), Magnoliidae (magnoliids), Superrosidae (superrosids) 
and Superasteridae (superasterids) (see ref. 10 for lineage definitions); 
these clades represent ~22%, 3%, 34% and 34%, respectively, of all 
extant angiosperm species. 

Figure 2 | Coordinated evolutionary transition rates between leaf 
phenology or conduit diameter and climate occupancy. a, b, A 
representation of coordinated evolution for the best likelihood-based 
model between leaf phenology for 2,630 species (evergreen, dark green; 
deciduous, light green) and climate occupancy (freezing exposed (freezing), 
striped; freezing unexposed (not freezing), solid) (a), and conduit diameter for 
860 species (large (≥0.044 mm), light blue; small (<0.044 mm), dark blue) and 
climate occupancy (b) based on models fit to all Angiospermae. The sizes of the 
black arrows in the plot are proportional to the transition rates between each 
possible state combination (larger arrows denote higher rates; no arrows for 
rates of 0). The number at the top of each panel denotes the number of extant 
Angiospermae species used in the analyses and percentages denote the 
percentage of extant species with that character state. The size of each circle is 
proportional to the persistence time in that state, where persistence time is 
defined as the inverse of the sum of the transition rates away from a given 
character state (that is, the inverse of the sum of all arrow rates out of a character 
state). c, d, The relative likelihood of the different pathways out of the evergreen 
and freezing-unexposed state and into the deciduous and freezing-exposed 
state (c), and out of the large-diameter conduit and freezing-unexposed state 
and into the small-diameter conduit and freezing-exposed state (d). The three 
possible pathways between two focal character state combinations provide 
insight into whether lineages typically evolved: (1) with the trait first, such that 
phenology or conduit diameter shifted before encountering freezing; (2) with 
climate occupancy first, such that phenology or conduit diameter shifted after 
encountering freezing; or (3) both simultaneously, such that shifts in phenology 
or conduit diameter and encountering freezing happened at the same time (see 
Supplementary Information for further details). 
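
The relative lability measure described in the text (the sum of all climate occupancy transition rates divided by the sum of all trait transition rates) can be illustrated with a minimal Python sketch. The rates below are hypothetical values chosen for illustration, not estimates from this study, and the function name `climate_trait_rate_ratio` is an assumption. The sketch also assumes that a simultaneous change in trait and climate would count towards both sums, which the text does not specify.

```python
from typing import Dict, Tuple

# A combined state is (trait state, climate occupancy state).
State = Tuple[str, str]
Rates = Dict[Tuple[State, State], float]


def climate_trait_rate_ratio(rates: Rates) -> float:
    """Sum of climate-occupancy transition rates over sum of trait transition rates.

    Assumption: a transition that changes both trait and climate is added
    to both sums; the source text does not say how such rates are handled.
    """
    climate_sum = 0.0
    trait_sum = 0.0
    for (src, dst), rate in rates.items():
        if src[1] != dst[1]:  # climate occupancy changed
            climate_sum += rate
        if src[0] != dst[0]:  # trait changed
            trait_sum += rate
    if trait_sum == 0:
        raise ValueError("no trait transitions: ratio is undefined")
    return climate_sum / trait_sum


# Hypothetical transition rates (illustration only, not values from the paper).
example_rates: Rates = {
    (("woody", "unexposed"), ("woody", "exposed")): 0.02,
    (("woody", "exposed"), ("woody", "unexposed")): 0.04,
    (("herbaceous", "unexposed"), ("herbaceous", "exposed")): 0.09,
    (("herbaceous", "exposed"), ("herbaceous", "unexposed")): 0.05,
    (("woody", "unexposed"), ("herbaceous", "unexposed")): 0.01,
    (("herbaceous", "unexposed"), ("woody", "unexposed")): 0.02,
    (("woody", "exposed"), ("herbaceous", "exposed")): 0.005,
    (("herbaceous", "exposed"), ("woody", "exposed")): 0.005,
}

ratio = climate_trait_rate_ratio(example_rates)
print(f"climate:trait rate ratio = {ratio:.2f}")
```

A ratio above 1 would indicate that climate occupancy changes more readily than the trait, and a ratio near 1 would indicate similar lability.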

Across woody angiosperms, a model that assumed coordinated evolu- 
tion of leaf phenology and climate occupancy was strongly supported 
over a model that assumed they evolved independently (difference in Akaike 
information criterion, ΔAIC = 310.1; Fig. 2a and Extended Data Table 2). 
Deciduous freezing-exposed and evergreen freezing-unexposed were 
highly persistent character states (Fig. 2a, as indicated by size of the 
circles, and Extended Data Table 3); persistence times (that is, expected 
time until state change) are defined as the inverse of the sum of estimated 
transition rates away from a given character state. Therefore, in the 
presence of freezing, the deciduous state was far more stable than the 
evergreen one. We also found that leaf phenology was generally about 
as labile as climate occupancy (climate:trait rate ratio = 0.845), and it 
was also far more likely to evolve as a response to a change in envir- 
onment rather than arising before encountering freezing (that is, cli- 
mate occupancy evolved first; Fig. 2c). 
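
As a minimal sketch of the persistence-time definition above (the inverse of the summed transition rates away from a state), the following Python snippet uses the leaf phenology rates as they are printed in Extended Data Table 3. The state labels and the function name `persistence_times` are assumptions introduced for illustration; the result is expressed in whatever time unit the rates use.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Leaf phenology and climate occupancy rates, transcribed from
# Extended Data Table 3 (keys are (from_state, to_state)).
leaf_rates: Dict[Tuple[str, str], float] = {
    ("evergreen_exposed", "evergreen_unexposed"): 0.051,
    ("deciduous_unexposed", "evergreen_unexposed"): 0.053,
    ("deciduous_exposed", "evergreen_unexposed"): 0.005,
    ("evergreen_unexposed", "evergreen_exposed"): 0.011,
    ("deciduous_unexposed", "evergreen_exposed"): 0.0023,
    ("deciduous_exposed", "evergreen_exposed"): 0.018,
    ("evergreen_unexposed", "deciduous_unexposed"): 0.008,
    ("evergreen_exposed", "deciduous_unexposed"): 0.0000,
    ("deciduous_exposed", "deciduous_unexposed"): 0.002,
    ("evergreen_unexposed", "deciduous_exposed"): 0.001,
    ("evergreen_exposed", "deciduous_exposed"): 0.0116,
    ("deciduous_unexposed", "deciduous_exposed"): 0.0116,
}


def persistence_times(rates: Dict[Tuple[str, str], float]) -> Dict[str, float]:
    """Return 1 / (sum of rates leaving each state)."""
    outgoing = defaultdict(float)
    for (src, _dst), rate in rates.items():
        outgoing[src] += rate
    # A state with no outgoing rate would never be left.
    return {s: (1.0 / total if total > 0 else float("inf"))
            for s, total in outgoing.items()}


times = persistence_times(leaf_rates)
for state, t in sorted(times.items(), key=lambda kv: -kv[1]):
    print(f"{state:22s} {t:6.1f}")
```

With these tabulated rates, the evergreen freezing-unexposed and deciduous freezing-exposed states come out with the longest persistence times (about 50 and 40 in the table's rate units), in line with the statement that these two states are highly persistent.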

Similarly, across woody angiosperms, a model assuming coordinated 
evolution of conduit diameter size and climate occupancy was strongly 
supported over a model that assumed they evolved independently 
(ΔAIC = 21.5; Fig. 2b and Extended Data Table 2). Both climate occupancy 
states (freezing exposed and freezing unexposed) in combination 
with small conduits were highly persistent (Fig. 2b and Extended 
Data Table 3). Additionally, no species with large conduits were in the 
freezing-exposed state, indicating that this is a highly transitory char- 
acter state (that is, short persistence time). As with leaf phenology, 
climate occupancy and conduit diameter were similar in their overall 
lability (climate:trait rate ratio = 0.895); however, a shift into environ- 
ments with freezing temperatures was far more likely to occur after 
conduits had already shifted from large to small (that is, the trait evolved 
before climate occupancy; Fig. 2d). 

Evolutionary shifts in growth habit were also strongly coordinated 
with shifts in climate. However, the nature of coordination varied con- 
siderably among major angiosperm clades (Extended Data Table 3), as 
did overall transition rates (superrosids and superasterids > magnoliids 
> monocots). Of 104 models evaluated, a 40-parameter model allow- 
ing each major lineage to have its own transition matrix received most 
support (Extended Data Table 4). These results were generally robust 
to uncertainty about whether species in the freezing-unexposed state 
actually lacked an ability to cope with freezing (Supplementary Information). 
Across angiosperms, asymmetry of transition rates led to numerous 
extant species in the woody freezing-unexposed and herbaceous 
freezing-exposed states (Fig. 3a and Extended Data Table 3). The large 
number of extant species in the woody freezing-unexposed state, accord- 
ing to our model, was the result of this state being persistent (Fig. 3a). 
Even within monocots, where relatively few woody species exist, the 
woody freezing-unexposed state was strongly persistent. The herbaceous 
freezing-exposed state, on the other hand, had low persistence times, 
indicating that the numerous extant species (N = 4,066 out of 12,706 
species for which data are available) were due to many rapid transitions 
both into and out of this character state (Fig. 3a). Climate occupancy was 
much more labile than growth form (climate:trait rate ratio = 4.93). 
Furthermore, the predominant pathway within angiosperms from the 
woody freezing-unexposed state to the herbaceous freezing-exposed 
state was to first evolve the herbaceous habit and subsequently enter 
habitats with freezing-exposed conditions (that is, the trait evolved 
before the climate occupancy; Fig. 3b). This, in combination with the 
conduit diameter results, suggests that lineages that successfully colo- 
nized new freezing environments were probably predisposed to do so, 
at least for these two traits. 

Although our focus here is on evolutionary links between species 
distributions with respect to freezing conditions and traits that allow 
species to cope with freezing, we note that differential diversification 
rates and vagility among lineages also certainly played their parts in 
determining why we see species where we do today. For instance, herbs 
may have higher speciation and/or extinction rates than woody taxa. 
Additionally, growth form may influence a plant’s ability to disperse to 
and colonize newly emerging locations with freezing temperatures. 
Tests of these alternatives are critical for fully understanding how 
angiosperms radiated into freezing environments, but such analyses require 
an even more complete record of global distributions of vagility and 
growth habit across land plants and a comparably more completely 
sampled phylogeny. These are non-trivial improvements as we currently 
have growth habit data for only 16% of accepted land plants 
(R.G.F. et al., manuscript submitted) and molecular and climate data 
for 26% (12,706 species) of those taxa. Total trait records are fewer for 
phenology (6,705 species) and conduit diameter (2,181 species). 


Figure 3 | Coordinated evolutionary transition 
rates between growth form and climate 
occupancy. a, A representation of coordinated 
evolution for the best likelihood-based model 
between growth form for 12,706 species 
(herbaceous, green; woody, brown) and climate 
occupancy based on a model assuming the same 
rates were applied to all Angiospermae (top plot 
above the dashed arrow), and the best-fit model, in 
which rates were estimated separately for the major 
lineages, that is, Monocotyledoneae, Magnoliidae, 
Superrosidae and Superasteridae (bottom four 
plots below the dashed arrows). b, The weighted 
average (by clade diversity) of the relative 
likelihood of the different pathways out of the 
woody and freezing-unexposed state and into the 
herbaceous and freezing-exposed state (see Fig. 2 
and Methods for further details). 

Among three key angiosperm strategies successful in today’s freezing 
environments (deciduous leaves, small conduits and herbaceous 
habit), our analyses indicated two especially striking findings. First, the 
pathway to herbaceousness or small conduits in freezing environments 
largely involved acquisition of the trait first (followed by adaptation 
to a new climate), whereas the pathway to deciduousness in freezing 
environments was largely via a shift in climate occupancy first (followed 
by evolution of the trait). Second, transitions between growth 
habit states should be fairly simple genetically, involving suppression 
and re-expression of only a few genes, and, traditionally, growth habit 
has been considered highly labile (ref. 6, but see refs 16, 28, 29). Our 
results are consistent with climate occupancy being more labile than 
growth habit, and freezing environments being largely filled by a subset 
of lineages that were already herbaceous or, if woody, had small con- 
duits before they encountered freezing. Why these lineages initially evolved 
a herbaceous habit and small conduit sizes remains unclear; these traits 
are probably tightly associated with responses to other environmental 
gradients (for example, aridity in the tropics) and numerous other aspects 
of a plant’s ecological strategy (for example, seed size, tissue defence, and 
so on) related to resource acquisition and disturbance regimes. Therefore, 
successful shifts between stem constructions take more than just turn- 
ing on or off a few genes. 

By weaving together a series of disparate threads encapsulating evolu- 
tion, functional ecology and the biogeographic history of angiosperms, 
including extensive functional trait databases and an exceptionally 
large timetree, we have documented the likely evolutionary pathways 
of trait acquisition facilitating angiosperm radiation into the cold. 


METHODS SUMMARY 


To examine the evolutionary responses to freezing in angiosperms, we first com- 
piled trait data on leaves and stems from existing databases and the literature. 
Growth form data came from numerous sources and were coded as a binary trait 
(woody or herbaceous; Supplementary Table 1). Leaf phenology and conduit dia- 
meter came from existing databases (see Supplementary Information for a list). 
Second, taxonomic nomenclature was made consistent among data sets and up to 
date by querying species names against the International Plant Names Index 
(http://www.ipni.org/), Tropicos (http://www.tropicos.org/), The Plant List (http:// 
www.theplantlist.org/) and the Angiosperm Phylogeny website (http://www.mobot. 
org/MOBOT/research/APweb/). Third, we obtained species’ spatial distributions 
from Global Biodiversity Information Facility records (http://www.gbif.org/; Sup- 
plementary Table 4) and then determined whether species encountered freezing 
temperatures using climate data from the WorldClim database (http://www.worldclim. 
org/). Fourth, we constructed a dated phylogeny for these species by downloading 
available GenBank sequences (http://www.ncbi.nlm.nih.gov/genbank/) for seven 
gene regions. Genetic data were compiled and aligned using the PHLAWD pipe- 
line (v.3.3a), and maximum-likelihood-based phylogenetic analyses of the total 
sequence alignment were performed using RAxML (v.7.4.1), partitioned by gene 
region and with major clades (that is, families and orders) constrained according to 
the APG III classification system. Branch lengths were time-scaled using congrui- 
fication, which involved using divergence times estimated from a reanalysis of a 
broadly sampled data set (Extended Data Fig. 2 and Supplementary Tables 2 and 3). 
Last, tests of coordinated evolution among traits in our database were analysed in 
the corHMM R package; transition rates between two binary traits were analysed 
using a likelihood-based model. 
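
The two binary codings used in the analyses (freezing exposure and conduit diameter class) follow directly from the thresholds stated in the text, and can be sketched in Python as below. This is an illustration under stated assumptions, not the authors' pipeline: it assumes each species has already been summarized by the minimum temperature encountered across its range and by a single conduit diameter, and the function names and example species are hypothetical.

```python
FREEZING_THRESHOLD_C = 0.0    # 'freezing unexposed' means only temperatures >0 °C
CONDUIT_THRESHOLD_MM = 0.044  # 'small' conduits are <0.044 mm in diameter


def climate_occupancy(min_temp_c: float) -> str:
    """Code a species by the coldest temperature met anywhere in its range."""
    if min_temp_c > FREEZING_THRESHOLD_C:
        return "freezing unexposed"
    return "freezing exposed"


def conduit_class(diameter_mm: float) -> str:
    """Code conduit diameter as 'small' (<0.044 mm) or 'large' (otherwise)."""
    return "small" if diameter_mm < CONDUIT_THRESHOLD_MM else "large"


# Hypothetical species records: (name, minimum temperature in °C, diameter in mm).
species = [
    ("sp_A", 4.2, 0.060),
    ("sp_B", -11.0, 0.030),
    ("sp_C", 0.0, 0.044),
]

coded = {name: (climate_occupancy(t), conduit_class(d)) for name, t, d in species}
for name, states in coded.items():
    print(name, states)
```

Boundary values fall on the exposed and large sides respectively, because the text defines the unexposed state as strictly above 0°C and small conduits as strictly below 0.044 mm.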


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

Extended Data Figure 1 | Examples of the definition of ‘woody’. a–d, We 
defined ‘woody’ as having a prominent aboveground stem that is persistent 
over time and with changing environmental conditions. a, Liriodendron 
tulipifera (Magnoliaceae), Joyce Kilmer Memorial Forest, Robbinsville, North 
Carolina, USA. b, Carnegiea giganteana (Cactaceae), Biosphere II, Tucson, 
Arizona, USA. c, Rhopalostylis sapida (Arecaceae) and Cyathea sp. 
(Cyatheaceae), Punakaiki, South Island, New Zealand. d, Pandanus sp. 
(Pandanaceae), Moreton Bay Research Station, North Stradbroke Island, 
Queensland, Australia (photographs by A.E.Z.). 

Extended Data Figure 2 | Reference timetree used for congruification 
analyses. Results of the divergence time estimation of 639 taxa of seed plants 
from the reanalysis of a previously described phylogeny. Fossil calibrations are 
indicated at the nodes with green circles, and numbers correspond to fossils 
described in Supplementary Table 2. Concentric dashed circles represent 
100-Myr intervals as indicated by the scale bar. 

Extended Data Table 1 | Number of species in different growth forms by clade 

Lineage Woody Herbaceous Total Proportion herbaceous 
Angiospermae 28650 17347 45997 0.38 
Magnoliidae 2438 75 2513 0.03 
Monocotyledoneae 1226 9894 11120 0.89 
Superasteridae 8468 4863 13331 0.36 
Superrosidae 14885 1956 16841 0.12 
ANA grade+Chloranthales 
Amborellales 1 0 1 0.00 
Austrobaileyales 48 0 48 0.00 
Chloranthales 18 7 25 0.28 
Nymphaeales 0 43 43 1.00 
Magnoliidae 
Canellales 71 0 71 0.00 
Laurales 1212 6 1218 0.00 
Magnoliales 1053 0 1053 0.00 
Piperales 102 69 171 0.40 
Monocotyledoneae 
Acorales 0 7 7 1.00 
Alismatales 3 513 516 0.99 
Arecales 793 0 793 0.00 
Asparagales 141 4133 4274 0.97 
Commelinales 0 180 180 1.00 
Dioscoreales 0 178 178 1.00 
Liliales 35 459 494 0.93 
Pandanales 80 17 97 0.18 
Petrosaviales 0 3 3 1.00 
Poales 109 4075 4184 0.97 
Zingiberales 61 329 390 0.84 
Basal eudicots+Gunnerales 
Buxales 31 0 31 0.00 
Ceratophyllales 0 3 3 1.00 
Gunnerales 2 14 16 0.88 
Proteales 1354 3 1357 0.00 
Ranunculales 134 488 622 0.78 
Trochodendrales 2 0 2 0.00 
Superasteridae 
Apiales 410 226 636 0.36 
Aquifoliales 211 0 211 0.00 
Asterales 548 1775 2323 0.76 
Berberidopsidales 3 0 3 0.00 
Bruniales 65 0 65 0.00 
Caryophyllales 545 712 1257 0.57 
Cornales 163 68 231 0.29 
Dilleniales 71 0 71 0.00 
Dipsacales 151 61 212 0.29 
Ericales 2798 350 3148 0.11 
Escalloniales 23 0 23 0.00 
Garryales 17 0 17 0.00 
Gentianales 1508 280 1788 0.16 
Lamiales 1214 1035 2249 0.46 
Paracryphiales 20 0 20 0.00 
Santalales 242 20 262 0.08 
Solanales 254 200 454 0.44 
Superrosidae 
Brassicales 136 389 525 0.74 
Celastrales 228 11 239 0.05 
Crossosomatales 31 0 31 0.00 
Cucurbitales 62 169 231 0.73 
Fabales 2462 448 2910 0.15 
Fagales 745 0 745 0.00 
Geraniales 27 63 90 0.70 
Huerteales 8 0 8 0.00 
Malpighiales 2978 294 3272 0.09 
Malvales 1195 64 1259 0.05 
Myrtales 2787 79 2866 0.03 
Oxalidales 396 14 410 0.03 
Picramniales 16 0 16 0.00 
Rosales 1465 143 1608 0.09 
Sapindales 2082 7 2089 0.00 
Saxifragales 190 246 436 0.56 
Vitales 42 1 43 0.02 
Zygophyllales 35 12 47 0.26 

Number of species that are woody, number of species that are herbaceous, total number of species, and proportion of herbaceous species in major lineages and orders. Proportions in bold are lineages with >0.5 species that are herbaceous. 

Extended Data Table 2 | Coordinated evolutionary model fits for leaf phenology, conduit diameter and climate occupancy 

Leaf phenology and climate occupancy 

Model                                   Number of parameters   −lnL       AIC      ΔAIC    w_i 
Character independent                   4                      -2305.4    4618.9   312.8   <0.01 
Character dependent, equal rates        1                      -2401.3    4804.5   498.4   <0.01 
Character dependent, all rates diff     8                      -2160.0    4336.0   29.9    <0.01 
Character dependent, all rates diff*    12                     -2141.1    4306.1   0       0.99 

Conduit diameter and climate occupancy 

Model                                   Number of parameters   −lnL       AIC      ΔAIC    w_i 
Character independent                   4                      -603.65    1223.3   21.5    <0.01 
Character dependent, equal rates        1                      -739.8     1481.6   279.8   <0.01 
Character dependent, all rates diff     8                      -592.91    1201.8   0       0.98 
Character dependent, all rates diff*    12                     -592.91    1209.8   8.0     0.02 

The likelihood-based best model in each case (shown in bold italics) was chosen based on both AIC and Akaike weights (w_i). Also listed for each model are the number of parameters, negative log likelihood (−lnL), 
and ΔAIC. The asterisk indicates a model where simultaneous changes in any two binary characters were allowed. 
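
The ΔAIC and Akaike weight columns can be recomputed from the AIC column using the standard definitions: ΔAIC is each model's AIC minus the smallest AIC, and the weight is exp(−ΔAIC/2) normalized over all candidate models. The Python sketch below applies these to the conduit diameter AIC values as printed in the table; the dictionary keys and the function name `akaike_weights` are labels chosen here for illustration.

```python
import math
from typing import Dict, Tuple

# AIC values for the conduit diameter models, as printed in Extended Data Table 2.
conduit_aic: Dict[str, float] = {
    "independent": 1223.3,
    "dependent, equal rates": 1481.6,
    "dependent, all rates differ": 1201.8,
    "dependent, all rates differ*": 1209.8,
}


def akaike_weights(aic: Dict[str, float]) -> Tuple[Dict[str, float], Dict[str, float]]:
    """Return (delta_aic, weights) for a set of candidate models."""
    best = min(aic.values())
    delta = {model: value - best for model, value in aic.items()}
    # Relative likelihood of each model given the data.
    rel = {model: math.exp(-0.5 * d) for model, d in delta.items()}
    total = sum(rel.values())
    weights = {model: r / total for model, r in rel.items()}
    return delta, weights


delta, weights = akaike_weights(conduit_aic)
best_model = min(conduit_aic, key=conduit_aic.get)

for model in conduit_aic:
    print(f"{model:30s} dAIC = {delta[model]:6.1f}  w = {weights[model]:.2f}")
```

Run on these four AIC values, the eight-parameter dependent model is preferred, and the rounded weights agree with the 0.98 and 0.02 reported in the table.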
Extended Data Table 3 | Coordinated evolutionary model transition rates 

Leaf phenology and climate occupancy (Angiospermae) 

Transition                                   Rate     (2.5%, 97.5%) 
EVERGREEN EXPOSED → EVERGREEN UNEXPOSED      0.051    (0.042, 0.065) 
DECIDUOUS UNEXPOSED → EVERGREEN UNEXPOSED    0.053    (0.053, 0.097) 
DECIDUOUS EXPOSED → EVERGREEN UNEXPOSED      0.005    (0.004, 0.006) 
EVERGREEN UNEXPOSED → EVERGREEN EXPOSED      0.011    (0.001, 0.014) 
DECIDUOUS UNEXPOSED → EVERGREEN EXPOSED      0.0023   (0.000, 0.003) 
DECIDUOUS EXPOSED → EVERGREEN EXPOSED        0.018    (0.012, 0.019) 
EVERGREEN UNEXPOSED → DECIDUOUS UNEXPOSED    0.008    (0.008, 0.012) 
EVERGREEN EXPOSED → DECIDUOUS UNEXPOSED      0.0000   (0.000, 0.001) 
DECIDUOUS EXPOSED → DECIDUOUS UNEXPOSED      0.002    (0.001, 0.002) 
EVERGREEN UNEXPOSED → DECIDUOUS EXPOSED      0.001    (0.000, 0.001) 
EVERGREEN EXPOSED → DECIDUOUS EXPOSED        0.0116   (0.009, 0.014) 
DECIDUOUS UNEXPOSED → DECIDUOUS EXPOSED      0.0116   (0.010, 0.019) 

Conduit diameter and climate occupancy (Angiospermae) 

Transition                                   Rate     (2.5%, 97.5%) 
LARGE EXPOSED → LARGE UNEXPOSED              100.0    (0.000, 100.0) 
SMALL UNEXPOSED → LARGE UNEXPOSED            0.005    (0.003, 0.041) 
SMALL EXPOSED → LARGE UNEXPOSED              0.000    (na, na) 
LARGE UNEXPOSED → LARGE EXPOSED              0.033    (0.000, 0.190) 
SMALL UNEXPOSED → LARGE EXPOSED              0.000    (na, na) 
SMALL EXPOSED → LARGE EXPOSED                0.000    (0.000, 0.000) 
LARGE UNEXPOSED → SMALL UNEXPOSED            0.096    (0.065, 1.07) 
LARGE EXPOSED → SMALL UNEXPOSED              0.000    (na, na) 
SMALL EXPOSED → SMALL UNEXPOSED              0.0353   (0.026, 0.038) 
LARGE UNEXPOSED → SMALL EXPOSED              0.000    (na, na) 
LARGE EXPOSED → SMALL EXPOSED                100.00   (0.000, 100.0) 
SMALL UNEXPOSED → SMALL EXPOSED              0.0225   (0.017, 0.026) 

Growth habit and climate occupancy (transition rates by lineage; 2.5% and 97.5% quantiles on the line below each rate) 

Transition                                   Monocotyledonae   Magnoliidae      Superrosidae     Superasteridae   Rest 
WOODY EXPOSED → WOODY UNEXPOSED              0.044             0.126            0.030            0.041            0.021 
                                             (0.05, 0.159)     (0.045, 0.112)   (0.027, 0.035)   (0.031, 0.049)   (0.007, 0.020) 
HERBACEOUS UNEXPOSED → WOODY UNEXPOSED       0.001             0.002            0.049            0.052            0.000 
                                             (0.000, 0.001)    (0.000, 0.010)   (0.041, 0.065)   (0.055, 0.076)   (0.000, 0.003) 
HERBACEOUS EXPOSED → WOODY UNEXPOSED         0.000             0.000            0.000            0.000            0.000 
                                             (na, na)          (na, na)         (na, na)         (na, na)         (na, na) 
WOODY UNEXPOSED → WOODY EXPOSED              0.005             0.017            0.001            0.0189           0.028 
                                             (0.008, 0.027)    (0.008, 0.019)   (0.009, 0.012)   (0.016, 0.024)   (0.016, 0.031) 
HERBACEOUS UNEXPOSED → WOODY EXPOSED         0.000             0.000            0.000            0.000            0.000 
                                             (na, na)          (na, na)         (na, na)         (na, na)         (na, na) 
HERBACEOUS EXPOSED → WOODY EXPOSED           0.001             0.016            0.008            0.012            0.001 
                                             (<0.001, 0.001)   (0.001, 0.021)   (0.007, 0.009)   (0.011, 0.013)   (<0.001, 0.003) 
WOODY UNEXPOSED → HERBACEOUS UNEXPOSED       0.001             0.001            0.002            0.004            <0.001 
                                             (0.000, 0.001)    (<0.001, 0.001)  (0.001, 0.002)   (0.002, 0.005)   (0.000, <0.001) 
WOODY EXPOSED → HERBACEOUS UNEXPOSED         0.000             0.000            0.000            0.000            0.000 
                                             (na, na)          (na, na)         (na, na)         (na, na)         (na, na) 
HERBACEOUS EXPOSED → HERBACEOUS UNEXPOSED    0.0483            0.003            0.024            0.045            0.003 
                                             (0.037, 0.086)    (<0.001, 0.036)  (0.017, 0.036)   (0.028, 0.062)   (0.003, 0.022) 
WOODY UNEXPOSED → HERBACEOUS EXPOSED         0.000             0.000            0.000            0.000            0.000 
                                             (na, na)          (na, na)         (na, na)         (na, na)         (na, na) 
WOODY EXPOSED → HERBACEOUS EXPOSED           0.007             0.000            0.002            0.004            0.003 
                                             (0.002, 0.019)    (0.000, 0.003)   (0.001, 0.005)   (0.002, 0.005)   (0.002, 0.004) 
HERBACEOUS UNEXPOSED → HERBACEOUS EXPOSED    0.060             0.015            0.090            0.147            0.033 
                                             (0.056, 0.129)    (0.011, 0.042)   (0.050, 0.139)   (0.101, 0.232)   (0.031, 0.304) 

The estimated transition rates for the best likelihood-based evolutionary transitions model between climate occupancy and either growth habit, leaf phenology or conduit diameter evolution are included. The 
numbers in parentheses denote the values at the 2.5% and 97.5% quantiles of the distribution of parameter estimates obtained from the same analyses run on the 100 bootstrapped trees (see Supplementary 
Information). The leaf phenology model includes transitions between combinations of leaf phenology (evergreen, deciduous) and climate occupancy (freezing exposed, freezing unexposed), the conduit diameter 
model includes transitions between combinations of conduit diameter (large ≥0.044 mm, small <0.044 mm) and climate occupancy, and the growth habit model includes transitions between combinations of 
growth form (herbaceous, woody) and climate occupancy. Arrows denote the direction of the transition. The growth habit model assumes se 
Monocotyledonae, Magnoliidae, Superrosidae, Superasteridae and all remaining angiosperms (the rest), including the ANA grade, Chloranth 
phenology and conduit diameter models assume a single model for all angiosperms. 

Extended Data Table 4 | Coordinated evolutionary model fits for 
growth form and climate occupancy 

Model    Number of parameters   −lnL      AIC       ΔAIC   w_i 
ABCDE    40                     -8348.9   16777.9   0      0.999 
AABCD    48*                    -8347.7   16791.3   13.4   <0.001 
AABCD    32                     -8353.9   16794.4   16.5   <0.001 

The top three of 104 likelihood-based models tested for growth form and climate occupancy 
evolution are reported. The best model, based on both AIC and Akaike weights (w_i), was a model that 
assigned a separate rate for the Monocotyledonae (position 1), Magnoliidae (position 2), Superrosidae 
(position 3), Superasteridae (position 4) and all remaining angiosperms, including the ANA grade, 
Chloranthales, Ceratophyllales and basal eudicots plus Gunnerales (position 5), respectively. Also listed 
for each model are the number of parameters, negative log likelihood (−lnL), and ΔAIC. The asterisk 
indicates a model where simultaneous changes in any two binary characters were allowed. 

decision making in scientific peer review 


In-Uck Park, Mike W. Peacey & Marcus R. Munafò 


The objective of science is to advance knowledge, primarily in two 
interlinked ways: circulating ideas, and defending or criticizing the 
ideas of others. Peer review acts as the gatekeeper to these mechan- 
isms. Given the increasing concern surrounding the reproducibility 
of much published research, it is critical to understand whether 
peer review is intrinsically susceptible to failure, or whether other 
extrinsic factors are responsible that distort scientists’ decisions. 
Here we show that even when scientists are motivated to promote 
the truth, their behaviour may be influenced, and even dominated, 
by information gleaned from their peers’ behaviour, rather than by 
their personal dispositions. This phenomenon, known as herding, 
subjects the scientific community to an inherent risk of converging 
on an incorrect answer and raises the possibility that, under certain 
conditions, science may not be self-correcting. We further dem- 
onstrate that exercising some subjectivity in reviewer decisions, 
which serves to curb the herding process, can be beneficial for the 
scientific community in processing available information to estim- 
ate truth more accurately. By examining the impact of different 
models of reviewer decisions on the dynamic process of publication, 
and thereby on eventual aggregation of knowledge, we provide a 
new perspective on the ongoing discussion of how the peer-review 
process may be improved. 

Current incentive structures in science promote attempts to publish 
in prestigious journals, which frequently prioritize new, exciting find- 
ings. One consequence of this may be the emergence of fads and 
fashions in the scientific literature (that is, ‘hot topics’), leading to 
convergence on a particular paradigm or methodology. This may not 
matter if this convergence is on the truth—topics may simply cease to 
be hot topics as the problem becomes solved. However, there is 
increasing concern that many published research findings are in fact 
false’. It is common for early findings to be refuted by subsequent 
evidence, often leading to the formation of groups that interpret the 
same evidence in notably different ways’, and this phenomenon is 
observed across many scientific disciplines**. There are a number of 
relatively recent examples of convergence on false hypotheses, such as 
the theory of stress causing gastric ulcer formation’. Once established, 
these can become surprisingly difficult to refute: they may become 
“more ‘vampirical’ than ‘empirical’—unable to be killed by mere evid- 
ence”. Science may therefore not be as self-correcting as is commonly 
believed’, and the selective reporting of results can produce literatures 
that “consist in substantial part of false conclusions”. 

It is important to understand how convergence on false conclusions 
may come about. A number of possibilities present themselves. First, 
scientists may not in fact be rational individuals pursuing the truth 
after all—an argument made by some influential sociologists of science 
(the strong programme)'’—or may be rational but stuck within a par- 
ticular paradigm". Second, some scientists may be biased or even 
immoral—a number of high profile cases of data fabrication and fraud 
have emerged in recent years’’. Third, some scientists may care more 
about publication and careers than discovering the truth (that is, ‘publish 
or perish’), a process which may be conscious or unconscious’*. In 
competitive fields current incentive structures prioritize positive results, 
which may increase the likelihood of modification of data or conducting 
many statistical tests to achieve these; similarly, increased error rates 
may arise from multiple competing research groups testing the same 
hypotheses™*. 

It has been shown that increased popularity of a particular research 
theme reduces the reliability of published results'*, and that findings 
published in prestigious journals are less reliable and more likely to be 
retracted’*. Therefore, the convergence of research interest on a current 
hot topic may serve to undermine the reliability and veracity of sub- 
sequently published findings. In principle, peer review should elim- 
inate or reduce these problems but, given empirical evidence for the 
unreliability of much published research, it may not in fact be con- 
ducted properly, or the process itself may be flawed. Empirical research 
and simulations have identified a number of factors which contribute 
to the likelihood that a published research finding is false’®. However, 
the peer-review process itself has not been closely investigated as a 
possible influence, despite the fact that it acts as the ultimate gatekeeper 
of research publication. It is generally regarded as imperfect, although 
still the best model available to ensure both the quality and veracity of 
published scientific research, but there has been growing concern that 
it fails, at least in part, with respect to each of these two goals’. 

To understand the peer-review mechanism better, using a Bayesian 
approach in a model of the publication process, we analysed the beha- 
viour of scientists who have developed their initial opinions indepen- 
dently as to which of the two opposing hypotheses, A and B, is more 
likely to be true. They know that on average their opinion is indeed 
correct with probability $\beta \in (\tfrac{1}{2}, 1)$, so they feel confident, but less than 
fully, about their opinion. The more controversial the issue, the lower 
the value of $\beta$. Upon receiving a manuscript that advocates one of the 
hypotheses, the editor of a hypothetical journal solicits a review from 
another scientist, who recommends acceptance or rejection. To focus 
on the influence of reviewer behaviour, rather than that of editor, we 
assume that the editor simply follows the reviewer's recommendation. 
Subsequently, the reviewer writes and submits their own manuscript 
to the journal, and the process repeats. The two decisions for each 
scientist are therefore: (1) whether or not to recommend acceptance 
of a manuscript that they are reviewing, and (2) which hypothesis to 
advocate in their own submission, which we term the ‘theme’ of their 
manuscript. As a publication history evolves (following cycles of sub- 
mission, peer review and acceptance or rejection) a scientist revises 
their view on the likelihood of each hypothesis being true, in light of 
the relative probability of this particular history occurring when one 
hypothesis is true as opposed to the other. Being motivated to promote 
the truth, each scientist will advocate a theme that is more likely to be 
true, according to their revised view when they submit a manuscript. 

Our aim was to understand how different criteria of reviewing decisions 
influence the publication outcome, and how the resulting publication 
histories and the information inherent in the relevant peer-review criterion 
influence the community’s eventual understanding of the topic. 
To this end we modelled and compared two different ways that scien- 
tists approach the reviewing decision. In the first model (M1), the sub- 
jective criterion of how strongly the reviewer agrees with the conclusion 
of the research (that is, the theme of the manuscript) is reflected in the 
decision, in addition to other more objective criteria such as research 
design and methodology. In the second model (M2), the decision 
reflects objective criteria only. Our findings, therefore, may shed light 
on whether subjective assessment is desirable in the peer-review process 
and, if so, to what extent. As a benchmark, we also compared M1 and 
M2 with a default model (M3), in which all manuscripts are published 
without any filtering through peer review. As scientists will make infer- 
ences that take into account how reviewers arrive at their recommenda- 
tions, the particular peer-review model in operation affects how they 
revise their views and, thereby, their decisions on which theme to advoc- 
ate as an author, as well as their decisions as a reviewer. 

The results of the three models (Fig. 1) indicate that: (1) almost cer- 
tainly, some scientists will submit manuscripts on themes which disagree 
with their initial opinion (we term this ‘herding’); (2) the extent to which 
the wider scientific community’s perception of a literature is removed 
from the truth (we term this ‘misperception’) decreases with number of 
publications, but information transmission is greatly hampered once 
herding has occurred, to such an extent that no further improvement 
in understanding occurs except in M1 where a degree of subjectivity is 
allowed in the reviewing decision (that is, reviewers as well as authors act 
guided by Bayesian inference); and (3) the probability of another pub- 
lication on a particular issue increases as the number of manuscripts 
published on that issue increases, owing to aggregation of information 
and herding reinforcing the scientific community’s consensus. 

The phenomenon known as herding is inherent in the behaviour of 
scientists operating under all of the models we consider. An individual 
is said to be herding if they choose a theme to advocate in their manu- 
script submission based entirely on what they have observed from 
others, independently of what they initially thought was true. The 
degree of herding depends on the peer-review model in operation, 
the number of manuscripts submitted so far, and how confident scientists 
feel about their initial opinion ($\beta$). Herding takes place relatively 
quickly (Fig. 1), and we observe discrete jumps in the measure of 
herding early on in the process, when each signal (that is, the informa- 
tion carried by a peer-review decision) carries a large weighting. 
Notably, the probability of herding and the speed with which it 
increases are eventually lower when a degree of subjectivity is allowed 
in the reviewing decision (M1), and only in this case can a fad be 
reversed following a sequence of publications on the same theme. As 
a fad persists, the total number of scientists required in order to reverse 
this fad increases—and at a faster rate. 

We use ‘misperception’ to describe how incorrect the perception of 
the wider scientific community is after a history of publication outcomes. 
It is defined as the probability that an outsider assigns to a hypothesis 
being correct, based on Bayesian inference from the observed history, 
when it is actually incorrect. The level of expected misperception (Fig. 1) 
remains relatively stable for low and high values of $\beta$, but for intermediate 
values of $\beta$ it declines with increasing numbers of submitted manuscripts. 
Critically, when a degree of subjectivity is allowed in the peer-review 
process (M1), this model always eventually outperforms the other models, 
because in M2 and M3 information completely fails to be transmitted 
after herding occurs. 

In our models, manuscript submission decisions made by individual 
scientists are based in part on information inferred from others’ actions, 
because individuals use information from the publication history within 
a particular field, as well as their personal opinions, to guide their deci- 
sions. This may have positive effects if the decisions cluster around a 
correct outcome, or have negative effects if they cluster around an incor- 
rect outcome. A degree of subjectivity in the peer-review process will, on 
average, lead to lower misperception, because reviewer decisions (and 
subsequent editorial decisions) which go against the herding trend will 
continue to reveal new information. In addition, the process is dynamic, 
and we show that self-correction can eventually occur when a degree of 
subjectivity is allowed in the peer-review process; however, it may not 
when the reviewing decision is completely independent of the reviewer's 
subjective assessment of the theme of the manuscript, and is based only 
on other, largely objective characteristics of the manuscript, such as the 
quality of the research methodology. In this case the probability of 
herding reaches 1 within finite time for all values of $\beta$, and the level of 
misperception cannot go below a certain lower bound. The concept of herding 
has been discussed in the context of scientific research in the past’’, 
but ours is the first study, to our knowledge, to model the processes by 
which it may occur. 

These results raise the question of whether a higher level of subject- 
ivity in reviewer decisions will lead to more effective restraint of incor- 
rect herding. We therefore decided to test generalized M1 models, in 
which we varied the degree to which the reviewer’s recommendation is 
determined by their subjective assessment of the conclusion. Our results 
(Fig. 2) indicate that excessively subjective reviews are not effective in 
restraining incorrect herding. This is because, in this case, recommen- 
dations are sensitive to whether the conclusion agrees with the reviewer's 
viewpoint at that time, and this factor is predominantly determined by 

Figure 2 | Expected misperception in a generalized version of the M1 model. 
We show the expected misperception for three values of the probability that the 
initial opinion is correct ($\beta$): (1) 0.55 (left), (2) 0.75 (middle), and (3) 0.95 
(right), reflecting high, intermediate, and low uncertainty. Results are shown 
for differing degrees to which the reviewer’s subjective assessment determines 
their recommendation (v): (1) 0.75 (red, solid line), (2) 1.00 (green, long dashed 


the accumulated information, rather than their original opinion, as 
publication history lengthens. In other words, in this case even the 
reviewers’ recommendations are subject to herding. It appears that a 
moderate degree of subjectivity (as depicted in M1) is near-optimal. 

Two empirical examples show that herding occurs in the scientific 
literature. First, belief in a specific scientific claim can be (and is) dis- 
torted through preferential citations of studies which support a particu- 
lar point of view rather than those which do not’. This phenomenon 
can be attributed to herding caused by preferential citations, potentially 
creating a spurious and unfounded sense of authority for specific claims. 
Second, using a meta-analytic review of a recent literature'*, we com- 
pared claims made in the abstracts of the contributing studies with 
support for those claims by the data reported therein. Meta-analysis 
imposes a standard analysis to maximize comparability, and thereby 
minimizes the extent to which the presentation of results can be influ- 
enced by flexible analytical options’’. These results (Fig. 3) show a 
mismatch between the claims made in the abstracts, and the strength 
of evidence for those claims based on a neutral analysis of the data, 
consistent with the occurrence of herding. 

We next consider whether scientists can decide on their conclusion 
before conducting an experiment. We suggest that herding leads to one 
outcome being preferable over another, and that flexible analysis and 
selective reporting allows data that do not conform to either be trans- 
formed”? or relegated to the file drawer”. Mendel famously appears to 
have dropped observations from his data so that his results conformed to 
his expectations”’, but because his theory was ultimately proved correct 
this is now generally overlooked. There is in fact clear evidence that the 
reporting and interpretation of findings is often inconsistent with the 
actual results”, and this appears to be particularly pronounced in 
abstracts of research articles (often the only part that is read)”. 

Scientists may be motivated by a number of factors, such as the desire 
to be the first to advocate an idea, and the natural tendency to side with 
others of a similar opinion. Herding is therefore expected when agents 
care only about being published and recognize some topics as ‘hot’ (and 
therefore publishable). If scientists are motivated in this way in our 
model, then in an equilibrium of the game they can simply follow the 
first author’s claim to maximize the probability of being published (see 
Supplementary Information). However, our results indicate that we can 
expect herding, including convergence on false conclusions, even when 
scientists—both as authors and reviewers—are rational and motivated 
by the pursuit of truth. The emergence of fads and fashions in the 
scientific literature (that is, hot topics)’ is therefore unsurprising. 

The first herding model in economics modelled individuals’ invest- 
ment choices™. Herding may have positive consequences, by driving 
rapid convergence on a correct decision. Rational individuals process 
all the information available to them before making decisions, and 
herding therefore arises from natural motives—a rational individual 
in pursuit of truth can and should be influenced by what others think. 
That humans are influenced in this way has been shown by experiments 


[Figure 2 panels: $\beta$ = 0.55 (left), 0.75 (middle), 0.95 (right); x-axis: number of submissions (0 to 20); y-axis: expected misperception.] 

line), (3) 1.25 (blue, short dashed line), and (4) 1.50 (black, dotted line). In the 
original M1 model v = 1, while lower values reflect a more objective reviewer, 
and higher values a more subjective reviewer. Excessively subjective reviews are 
not effective in restraining incorrect herding (this is not yet visible for $\beta$ = 0.55, 
but would become apparent with more submissions). 


in social psychology”. It is rational because humans are aware of their 
own fallibility, and so their opinions may be strengthened or weakened 
by the views of others. In other words, being aware of the wisdom of the 
crowd, humans are (rationally) influenced by the crowd; in order to 
update our beliefs in the light of new evidence, we should be guided by 
Bayes’ theorem. However, herding may also have negative conse- 
quences, by driving convergence on an incorrect decision. This is par- 
ticularly problematic if an outsider to the process is unaware that it is 
taking place, as it gives a spurious sense of certainty to the observed 
convergence. 

Free, open and global access to research reports has been proposed 
as an alternative to peer review (http://am.ascb.org/dora/), but, as we 
have shown, peer review can reveal more information relative to free 
and complete sequential publication. Reviewer recommendations, and 
resulting editor decisions, contain information, and thus prevent herd- 
ing from completely blocking new information flow. However, this 
depends on specific parameters such as the popularity of the subject 
(for example, how many people are writing about this issue, or how 
long it is discussed) and how strongly scientists feel about their initial 
dispositions (that is, the level of $\beta$). In particular, if reviewers (and 
editors) are explicitly encouraged to be as objective as possible they will 
not be guided by Bayes’ theorem when making their recommenda- 
tions—it is only when reviewers are allowed a degree of subjectivity that 
this is done. Our results indicate that peer review performs best when the 
reviewers exercise their subjectivity at an intermediate level; higher levels 
enhance the risk of complete herding in reviewer decisions, whereas 
lower levels curb the information flow from reviewer decisions. 

The peer-review process is therefore in principle self-correcting over a 
sufficiently extended period (although distortions may occur in the 
shorter term), in that de-herding can also occur. In reality, de-herding 
will not always occur, because publication histories within a topic may not 

Figure 3 | Empirical evidence of discrepancy between claims and results. 
We show claims made in the abstracts of studies, and the results of those studies 
derived from a standardized analysis. Abstracts were coded as pro or con 
depending on whether an association was claimed, based on the judgement of an 
independent rater. Results were coded as pro or con depending on whether the 
overall effect size for the full sample in the study was statistically significant at 
P < 0.05. Five abstracts could not be coded as either pro or con. The proportion of 
pro versus con classifications differed for claims (80% pro) and results (44% pro), 
suggesting herding around the first published claim (McNemar test: P = 0.016, 
two-tailed test). Treating abstracts that could not be coded as pro or con did not 
alter these results substantially (84% versus 64% pro). 

persist for sufficiently long. Science may therefore not be as self-correcting 
as is commonly assumed’, and peer-review models which encourage 
objectivity over subjectivity may reduce the ability of science to self- 
correct. Although herding among agents is well understood in cases 
where the incentives directly reward acting in accord with the crowd 
(for example, financial markets), it is instructive to see that it can occur 
when agents (that is, scientists) are motivated by the pursuit of truth, and 
when gatekeepers (that is, reviewers and editors) exist with the same 
motivation. In such cases, it is important that individuals put weight 
on their private signals, in order to be able to escape from herding. 
Behavioural economic experiments indicate that prediction markets, 
which aggregate private signals across market participants, might provide 
information advantages”*. Knowledge in scientific research is often highly 
diffuse, across individuals and groups”, and publishing and peer-review 
models should attempt to capture this. We have discussed the importance 
of allowing reviewers to express subjective opinions in their recommen- 
dations, but other approaches, such as the use of post-publication peer 
review, may achieve the same end. 


METHODS SUMMARY 

Model. A number of scientists, indexed as i = 1, 2, ..., deliberate over two opposing 
hypotheses perceived ex ante to be equally likely to be true. Initially each scientist 
i receives an independent private signal regarding the true hypothesis, which is 
correct with probability $\beta \in (\tfrac{1}{2}, 1)$. Sequentially, scientist i submits a manuscript 
defending one of the two hypotheses, termed its theme, which is reviewed by the 
next scientist i + 1 who decides whether to accept or reject the manuscript. This 
decision, and the theme if accepted, becomes common knowledge. Each scientist 
submits a manuscript defending a theme that is more likely to be the true hypo- 
thesis according to their posterior belief, formed by Bayes’ rule based on all the 
information available at that time. We consider three models of reviewer decision. 
In M1, the reviewer accepts a manuscript with a probability proportional to the 
likelihood of its theme being true according to their posterior belief. In M2, they 
accept it irrespective of its theme with the ex ante probability they would accept a 
manuscript after the same publication history in M1. In M3, they simply accept it. 
Concepts. A scientist is herding if their posterior belief attaches a probability 
greater than 0.5 to a particular hypothesis regardless of their own signal when 
they submit. Their probability of herding is the ex ante probability that they will be 
herding. The misperception after a publication history is the expected probability 
attached to the hypothesis, which is in reality incorrect, by outside observers who 
form their posterior beliefs on true hypothesis by Bayes’ rule based on the history. 
The expected misperception after n submissions is the probability-weighted sum 
of misperceptions over all possible histories that may occur with n submissions. 
Analysis. We wrote a computer program to recursively calculate numerical values 
of algebraic formulae for various concepts reported, and algebraically derived 
asymptotic properties for large numbers of submissions. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
METHODS 


Model of the peer review process. We analyse a model in which n + 1 ex ante 
identical scientists deliberate over two opposing hypotheses, labelled A and B. It is 
known that only one of these hypotheses is correct, and that ex ante both are 
equally likely to be correct. Denoting the correct hypothesis by $\tau$, this is expressed 
as $P(\tau = A) = P(\tau = B) = \tfrac{1}{2}$. Before the game starts, each scientist i receives a 
private signal, $s_i \in \{A, B\}$, regarding which is the true hypothesis. These signals 
are independent random variables that assume a value equal to the correct hypothesis 
with probability $\beta$. The signals are informative but not perfect, that is, 
$\beta \in (\tfrac{1}{2}, 1)$. Lower values of $\beta$ can be interpreted as reflecting a more controversial 
nature of the issue under question, when the signals tend to be less accurate. 

Sequentially, and motivated to publish what is true, different scientists submit a 
manuscript, each defending a particular hypothesis. The ‘theme’ of scientist i’s 
manuscript, $t_i \in \{A, B\}$, denotes the hypothesis that is defended. We postulate that, 
upon receiving a manuscript, the editor elicits peer review from a scientist whose 
stance on the topic is unknown to the editor, which eliminates the editor’s influ- 
ence on the editorial decision through reviewer selection. This is done to focus our 
analysis on reviewer behaviour, and means that in our model each manuscript is 
assigned to a scientist who has neither submitted their own manuscript nor acted 
as a reviewer at that point (because otherwise the editor would have inference on 
their stance from the theme of their submission or their previous decision as a 
reviewer). The editor follows the reviewer’s recommendation in deciding whether 
to accept or reject the manuscript. If it is accepted, its theme becomes common 
knowledge; if it is rejected, the theme is not disclosed, but the rejection becomes 
common knowledge. Then, a new submission is made by a scientist who has not sub- 
mitted before. In particular, our analysis is focused on the case that the next scientist 
who submits a manuscript is the one who reviewed the previous manuscript. 

Thus, labelling the scientist who writes the i-th submission as i, each scientist 
$i \in \{1, 2, \ldots, n\}$ sequentially submits a manuscript advocating a theme $t_i \in \{A, B\}$, 
which is reviewed by the next scientist $j = i + 1$, who subsequently writes and 
submits their own manuscript. Scientist n + 1, who also receives a signal $s_{n+1}$, 
only reviews. Scientists observe the history of publication outcomes as they arise. 
Let $h^i \in \{A, B, \emptyset\}^i$ denote a history of the first i publication outcomes, where each 
published manuscript is recorded by its theme, A or B, and each unpublished 
manuscript by $\emptyset$. Then, there are three items of information available to each 
scientist j when they make decisions: (1) their own private signal $s_j \in \{A, B\}$; (2) a 
manuscript to be reviewed with a theme $t_{j-1} \in \{A, B\}$ if $j > 1$; and (3) a history 
$h^{j-2} \in \{A, B, \emptyset\}^{j-2}$ if $j > 2$. The two decisions to make are whether or not to 
recommend acceptance of a manuscript that they are reviewing, and the theme 
of the manuscript they subsequently submit. 

We made a few modelling choices that simplify real practices, namely that: 
(1) only one reviewer is consulted for each submission; (2) the current reviewer 
is the next author; (3) rejections become common knowledge; and (4) authors 
conform to the rationality assumption that they are Bayesian updaters. Choices 1 
and 2 maximize the number of submissions that can be reviewed by a given 
number of scientists, subject to the editor not soliciting a review from someone 
with a known stance. Choice 3 spares scientists from having to make probabilistic 
inferences as to what other submissions might have been made but rejected, which 
would have been necessary to determine the optimal choices when they act. These 
features enable us to examine the largest possible number of submissions with the 
available computing power, and thus allow us to generate more meaningful out- 
puts without changing the essential processes operating. We believe that our main 
message will remain valid when these assumptions are relaxed (see Supplementary 
Information for a further discussion of choice 3). However, the complexity of the 
computer program needed to analyse such cases, and the corresponding comput- 
ing power required, will increase exponentially. Choice 4 assumes authors use all of 
the information available to them, in accordance with Bayes’ theorem”’, to deter- 
mine the relative likelihood (called a posterior belief) that each of the two alterna- 
tive hypotheses is correct. Then, being motivated to publish what is true, each 
scientist will submit a manuscript advocating the hypothesis that is more likely to 
be correct according to their posterior belief, augmented by a standard tie-breaking 
rule of following their own signal when both are equally likely**. This is one of the 
rationality assumptions that economists place on humans. 
Models of reviewer behaviour. In the first model, M1, scientist j = i + 1 recommends 
acceptance of scientist i’s manuscript with the same probability, denoted by 
$P(\tau = t_i \mid \beta, h^{j-2}, t_i, s_j)$, that they infer the theme of the manuscript to be the correct 
hypothesis, by Bayes’ rule based on all the information available to them at that 
point. Therefore, reviewers as well as authors act guided by Bayesian inference in 
this model. The acceptance probabilities are endogenous and evolve differently 
depending on how the publication history unfolds. 
In the second model, M2, the acceptance decision is completely independent of 
the reviewer's subjective assessment of the theme of the manuscript, and rather is 
based on other, largely objective characteristics of the manuscript, such as the 
quality of the research methodology. Presuming that these traits are statistically 
independent of the manuscript’s conclusion, the acceptance probabilities in M2 
are independent of both the theme of the manuscript and the assigned reviewer 
(insofar as the only feature that distinguishes reviewers is their assessment of 
which hypothesis is correct). Thus, the acceptance probabilities can be thought 
of as the likelihood that the methodological quality of the manuscript is sufficient 
to warrant publication, and not a reflection of whether or not the reviewer agrees 
with the conclusions. However, our model does not specify what those probabil- 
ities should be. To aid comparison between the models, we considered two cases. 
In one, scientist j, irrespective of their own signal, recommends acceptance of i’s 
manuscript with a probability equal to the ex ante probability that they would 
recommend acceptance of i’s manuscript in M1 after the same history (this results 
in the same expected number of publications in both M1 and M2). In the other, the 
acceptance probability remains the same throughout, at the initial expected 
acceptance probability of the M1 model, which is $\beta$. To verify this, note that 
scientist 2 would recommend acceptance of scientist 1’s manuscript with probability 
$\frac{\beta^2}{\beta^2 + (1-\beta)^2}$ when $s_2$ agrees with $t_1$ (which happens with probability 
$\beta^2 + (1-\beta)^2 = 1 - 2\beta + 2\beta^2$) but with probability 0.5 otherwise. Hence, the 
expected probability of acceptance is $\beta^2 + \tfrac{1}{2}(2\beta - 2\beta^2) = \beta$. As the results are 
similar in the two cases of M2, here we report only on the former. 
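
The claim that the initial expected acceptance probability in M1 equals the signal accuracy can be checked with exact arithmetic. The following is a minimal sketch rather than the program used in the study; the function name `initial_acceptance_probability` is hypothetical, and rational numbers are used only to avoid rounding noise.

```python
from fractions import Fraction


def initial_acceptance_probability(beta):
    """Expected probability that scientist 2 accepts scientist 1's manuscript in M1.

    Scientist 1 has no history to learn from, so their theme equals their signal.
    """
    # Probability that the reviewer's signal agrees with the theme.
    agree = beta**2 + (1 - beta)**2
    # Reviewer's posterior that the theme is correct when the signals agree.
    accept_if_agree = beta**2 / agree
    # When the two signals disagree they cancel out, leaving a 50:50 belief.
    accept_if_disagree = Fraction(1, 2)
    return agree * accept_if_agree + (1 - agree) * accept_if_disagree


print(initial_acceptance_probability(Fraction(3, 4)))  # 3/4
```

The same identity holds for any accuracy strictly between 1/2 and 1, because the first term reduces to the square of the accuracy and the second to the accuracy minus its square.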

In the third (benchmark) model, M3, all manuscripts are published without any 
filtering through peer review. This model is identical to M2 but with the accept- 
ance probability equal to 1 throughout the process. This is a simple model of herd 
behaviour**** that has become standard in economics when modelling self-motivated, 
rational individuals who sequentially take actions. A consequence of this model is 
that each scientist will have access to all previous submissions when forming their 
decision (because everything is published in this model). Note that this differs from 
a full information case (that is, where every scientist has access to all private signals, 
as well as public actions). 
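
For the benchmark model the expected misperception can be computed exactly with a short recursion. The sketch below is an illustration under stated assumptions, not the authors' program: it assumes that hypothesis A is true (sufficient by symmetry), that with the tie-breaking rule of following one's own signal each published theme reveals the author's signal until one hypothesis leads by two revealed signals, and that from then on every author herds and publications carry no further information. The function name `m3_expected_misperception` and the state variable `d` are hypothetical.

```python
from fractions import Fraction


def m3_expected_misperception(beta, n):
    """Expected misperception after n submissions when everything is published.

    The state d is the number of revealed signals for A minus those for B,
    tracked under the assumption that A is the true hypothesis.
    """
    dist = {0: Fraction(1)}  # probability distribution of d before any submission
    for _ in range(n):
        nxt = {}
        for d, p in dist.items():
            if abs(d) >= 2:
                # Herding: the author follows the crowd, so d does not change.
                nxt[d] = nxt.get(d, 0) + p
            else:
                # The published theme reveals the author's private signal.
                nxt[d + 1] = nxt.get(d + 1, 0) + p * beta
                nxt[d - 1] = nxt.get(d - 1, 0) + p * (1 - beta)
        dist = nxt

    def posterior_A(d):
        # Outsider's belief in A with equal priors and a net lead of d signals.
        # Negative exponents are fine because beta is a Fraction.
        return beta**d / (beta**d + (1 - beta)**d)

    # Misperception is the belief attached to the incorrect hypothesis B.
    return sum(p * (1 - posterior_A(d)) for d, p in dist.items())


print(m3_expected_misperception(Fraction(3, 4), 1))  # 3/8
print(m3_expected_misperception(Fraction(3, 4), 2))  # 3/10
```

Because the states with a lead of two are absorbing, the probability mass that reaches them stops contributing new information, which is the sense in which herding blocks information flow in this benchmark.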

In the generalized M1 models, scientist j recommends acceptance with probability 

$$\min\left\{1,\ \tfrac{1}{2} + v\left(P(\tau = t_i \mid \beta, h^{j-2}, t_i, s_j) - \tfrac{1}{2}\right)\right\} \quad \text{if } P(\tau = t_i \mid \beta, h^{j-2}, t_i, s_j) \ge \tfrac{1}{2},$$

and with probability 

$$\max\left\{0,\ \tfrac{1}{2} + v\left(P(\tau = t_i \mid \beta, h^{j-2}, t_i, s_j) - \tfrac{1}{2}\right)\right\} \quad \text{if } P(\tau = t_i \mid \beta, h^{j-2}, t_i, s_j) < \tfrac{1}{2},$$

where v > 0. The case v = 1 corresponds to the original M1 model, 
with higher values of v indicating that the recommendation is more heavily influenced 
by the reviewer’s subjective assessment on the advocated theme, and lower v 
meaning that it is less so. 
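
The acceptance rule of the generalized M1 models amounts to stretching the reviewer's posterior around one half and clipping the result to the unit interval. A minimal sketch follows; the function and argument names are hypothetical, and the posterior is taken as a given input rather than computed from a publication history.

```python
def generalized_acceptance_probability(p_theme_true, v):
    """Acceptance probability in the generalized M1 model.

    p_theme_true: reviewer's posterior probability that the theme is correct.
    v: weight on the reviewer's subjective assessment (v > 0; v = 1 gives M1).
    """
    if v <= 0:
        raise ValueError("v must be positive")
    # Scale the distance of the posterior from 1/2 by v.
    raw = 0.5 + v * (p_theme_true - 0.5)
    # Clip from above when the reviewer leans towards the theme, from below otherwise.
    return min(1.0, raw) if p_theme_true >= 0.5 else max(0.0, raw)


print(generalized_acceptance_probability(0.75, 0.75))  # 0.6875
print(generalized_acceptance_probability(0.9, 1.5))    # 1.0
```

With v above 1 a sufficiently confident reviewer accepts or rejects with certainty, so the decision then reflects only which side of one half the posterior falls on.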

Definitions and algebraic formulae. The misperception is defined from the 
perspective of outsiders who observe the publication history. Using all the 
information available to them from the observed history, $h^n \in \{A, B, \emptyset\}^n$, outside 
observers will form via Bayes’ rule a posterior belief that attaches probability 

$$P(\tau \mid h^n) = \frac{P(h^n \mid \tau)}{P(h^n \mid A) + P(h^n \mid B)}$$

to hypothesis $\tau$ being true for $\tau \in \{A, B\}$, where 
$P(h^n \mid \tau)$ is the probability that the history $h^n$ realizes under hypothesis $\tau \in \{A, B\}$. 
We define the misperception, after history $h^n$, as the expected posterior probability 
attached to the hypothesis which is in reality incorrect: since $P(\tau = A) = P(\tau = B) = \tfrac{1}{2}$, it is: 

$$\frac{\tfrac{1}{2} \sum_{\tau \in \{A,B\}} \left[1 - P(\tau \mid h^n)\right] P(h^n \mid \tau)}{\tfrac{1}{2} \sum_{\tau \in \{A,B\}} P(h^n \mid \tau)}$$

The expected misperception after n submissions is defined as a probability-weighted 
sum of misperceptions over all possible histories of length n that may occur: 

$$E[\text{misperception}] = \tfrac{1}{2} \sum_{\tau \in \{A,B\}} \sum_{h^n \in \{A,B,\emptyset\}^n} \left[1 - P(\tau \mid h^n)\right] P(h^n \mid \tau) \qquad (2)$$

Note that these calculations are done for an underlying value of $\beta$. 
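
The definitions of the outsider's posterior, the misperception after a history and the expected misperception translate directly into code. The sketch below is illustrative only; the function names and the dictionary layout (each history mapped to its pair of likelihoods under A and under B) are assumptions, and the likelihoods themselves must be supplied by whichever reviewer model is being analysed.

```python
from fractions import Fraction


def posterior(lik_A, lik_B):
    """Outsider's posterior that A is true, with equal priors."""
    return lik_A / (lik_A + lik_B)


def misperception(lik_A, lik_B):
    """Expected posterior weight on the incorrect hypothesis after one history."""
    p_A = posterior(lik_A, lik_B)
    # If A is true (probability p_A) the wrong answer gets 1 - p_A, and vice versa.
    return p_A * (1 - p_A) + (1 - p_A) * p_A


def expected_misperception(likelihoods):
    """Probability-weighted sum over histories, as in equation (2)."""
    half = Fraction(1, 2)
    total = 0
    for lik_A, lik_B in likelihoods.values():
        if lik_A + lik_B == 0:
            continue  # this history never occurs, so it contributes nothing
        p_A = posterior(lik_A, lik_B)
        total += half * ((1 - p_A) * lik_A + p_A * lik_B)
    return total


# One submission that is always published (the benchmark case), accuracy 3/4:
beta = Fraction(3, 4)
print(expected_misperception({"A": (beta, 1 - beta), "B": (1 - beta, beta)}))  # 3/8
```

Passing the likelihoods in explicitly keeps the definitions separate from the model of reviewer behaviour, so the same functions can be reused for any of the models once the history probabilities are known.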

Focusing on $h^1$ (for which we need two scientists), there are three possible
histories, namely $h^1 \in \{A,B,\emptyset\}$. Equation (2), above, which gives us the expected
misperception, will have 6 terms when n = 1, because each of the three histories
can occur from either hypothesis $\tau \in \{A,B\}$. Note that $P(\tau \mid h^1)$ is symmetric in the
sense that its value remains the same when A and B (as values of $\tau$ and elements of
$h^1$) are permuted. A consequence of this symmetry is that we only need to consider
the case when one hypothesis (for example, A) is correct, and the sum of 6 terms
will be equal to twice the sum of the three terms relevant for $\tau = A$. For n = 2,
because there are $3^2 = 9$ possible histories, there will be 9 terms to calculate (after
taking into account the symmetry). Similarly, the expected misperception after n
submissions can be obtained by calculating $3^n$ terms:

\[ E[\text{misperception}] = \sum_{h^n \in \{A,B,\emptyset\}^n} \left[1 - P(A \mid h^n)\right] P(h^n \mid A) \qquad (3) \]

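As an illustration of equations (1) to (3), the following Python sketch computes the posterior and the expected misperception from a table of history probabilities. The table `p_hist` and its numbers are hypothetical, chosen only to be symmetric under swapping A and B; they are not results from the model, and the string "rejected" simply stands in for the empty-history symbol.

```python
def posterior(p_hist, tau, h):
    """P(tau | h) by Bayes' rule with equal priors on hypotheses A and B."""
    return p_hist[tau][h] / (p_hist["A"][h] + p_hist["B"][h])


def expected_misperception_full(p_hist):
    """Equation (2): half the sum over both hypotheses and all histories."""
    return 0.5 * sum(
        (1.0 - posterior(p_hist, tau, h)) * p_hist[tau][h]
        for tau in ("A", "B")
        for h in p_hist[tau]
    )


def expected_misperception_symmetric(p_hist):
    """Equation (3): sum over histories with hypothesis A taken as correct."""
    return sum(
        (1.0 - posterior(p_hist, "A", h)) * p_hist["A"][h]
        for h in p_hist["A"]
    )


# Hypothetical P(h | tau) for n = 1 (illustrative values only).
p_hist = {
    "A": {"A": 0.5, "B": 0.2, "rejected": 0.3},
    "B": {"A": 0.2, "B": 0.5, "rejected": 0.3},
}
full = expected_misperception_full(p_hist)
symmetric = expected_misperception_symmetric(p_hist)
print(full, symmetric)
```

Because the hypothetical table is symmetric in A and B, the two functions return the same value, which is the shortcut the text uses to halve the number of terms.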
Herding is defined for scientists who are submitting papers. A scientist, say j, is
said to be herding if they would choose the same theme to advocate regardless of
their private signal as their posterior belief would attach a probability more than
one half to a particular hypothesis regardless of their own signal, that is, if:

\[ \text{for some } \tau \in \{A,B\}: \quad \min\left\{ P(\tau \mid \beta, h^{j-2}, t_{j-1}, s_j = A),\ P(\tau \mid \beta, h^{j-2}, t_{j-1}, s_j = B) \right\} > \tfrac{1}{2} \qquad (4) \]

The probability of herding, for a scientist j, can easily be calculated by the following
probability-weighted sum:

\[ \text{Probability of herding} = \sum_{(h^{j-2},\, t_{j-1})} \mathbf{1}_{(4)}(h^{j-2}, t_{j-1})\, P(h^{j-2}, t_{j-1}) \qquad (5) \]

where $P(h^{j-2}, t_{j-1})$ is the probability that $(h^{j-2}, t_{j-1})$ realizes from either hypothesis
and $\mathbf{1}_{(4)}$ is the indicator function that assumes a value of 1 if (4) holds, and 0
otherwise. 
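Condition (4) and the sum in (5) can be sketched in Python as follows. The sketch assumes that the private signal is independent of the public information given the hypothesis, so that the likelihood factorizes as in step 3 of the program description; the function names and the three public profiles in the usage example are hypothetical.

```python
def is_herding(p_public_a, p_public_b, beta):
    """Condition (4) for one public profile (h^{j-2}, t_{j-1}).

    p_public_a and p_public_b are P(h^{j-2}, t_{j-1} | tau) for tau = A, B.
    """
    def post(tau, signal):
        like_a = p_public_a * (beta if signal == "A" else 1.0 - beta)
        like_b = p_public_b * (beta if signal == "B" else 1.0 - beta)
        return (like_a if tau == "A" else like_b) / (like_a + like_b)

    return any(
        min(post(tau, "A"), post(tau, "B")) > 0.5 for tau in ("A", "B")
    )


def herding_probability(profiles, beta):
    """Equation (5): ex ante probability-weighted sum of the indicator."""
    total = 0.0
    for p_a, p_b in profiles.values():
        if is_herding(p_a, p_b, beta):
            total += 0.5 * (p_a + p_b)  # equal priors on A and B
    return total


# Hypothetical public profiles mapped to (P(. | A), P(. | B)).
profiles = {"x": (0.8, 0.1), "y": (0.1, 0.8), "z": (0.1, 0.1)}
print(herding_probability(profiles, beta=0.7))
```

In this toy input, profiles "x" and "y" are lopsided enough that either private signal leaves the same hypothesis favoured, whereas "z" is uninformative and the private signal decides the theme.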

When herding occurs, some histories and information profiles will occur with 
probability zero. This means that there will generally be a number of terms in (3) 
and (5) that will never occur, so the calculations required will generally be over a 
smaller number of terms than the theoretical upper bound. Nevertheless, the large 
number of terms that result from even a moderate n are impossible to simplify to 
obtain a closed-form algebraic expression for either the expected misperception or 
the probability of herding. We therefore wrote a computer program to numerically 
calculate the algebraic expressions within available computing power. 
Computer program. The program (code provided in the Supplementary
Information) worked by building and evaluating the algebraic formulae to obtain
results that are accurate up to the level of precision the computer used in its
calculations (52 dp), as explained through a number of key steps described below
for various values of $\beta$. The information a reviewer j has, $(h^{j-2}, t_{j-1}, s_j) \in \{A,B,\emptyset\}^{j-2} \times \{A,B\}^2$,
is referred to as their ‘information profile’. 

Step 1: For each of the two possible private signals of scientist 1, $s_1 \in \{A,B\}$, a
probability is set for the occurrence of that signal conditional on each of the two
hypotheses $\tau \in \{A,B\}$: $P(s_1 \mid \tau) = \beta$ if $s_1 = \tau$ and $P(s_1 \mid \tau) = 1 - \beta$ otherwise. Thus, the
posterior on the true hypothesis is calculated as:

\[ P(\tau \mid s_1) = \frac{P(s_1 \mid \tau)}{P(s_1 \mid A) + P(s_1 \mid B)} \]

Step 2: For each signal $s_1$ a submission decision of scientist 1 is prescribed. As
$P(\tau = s_1 \mid s_1) > 0.5$, for scientist 1 the theme of their submitted paper ($t_1$) will be
identical to their signal ($s_1$). This determines the probability of $t_1 \in \{A,B\}$ conditional
on $\tau \in \{A,B\}$. 

Step 3: For each possible information profile $(t_1, s_2) \in \{A,B\}^2$ of scientist 2, the
probability of acceptance (of scientist 1’s submission with theme $t_1$) is determined
in accordance with the adopted model. For M1 (and hence, M2), this involves
calculating scientist 2’s posterior beliefs as

\[ P(\tau \mid t_1, s_2) = \frac{P(t_1, s_2 \mid \tau)}{P(t_1, s_2 \mid A) + P(t_1, s_2 \mid B)} \]

where $P(t_1, s_2 \mid \tau) = P(t_1 \mid \tau) P(s_2 \mid \tau)$. 

Step 4: If scientist 1’s manuscript is rejected, a history $h^1 = \emptyset$ ensues. If accepted,
a history $h^1 = t_1$ ensues. For each possible history $h^1$, the conditional probability
$P(h^1 \mid \tau)$ is obtained by aggregating the probabilities that it arises from different
signal profiles $(s_1, s_2)$ conditional on $\tau$. The misperception is calculated for each
history according to the formula (1), and then the expected misperception according
to the formula (3). 

Step 5: The submission decision of scientist 2, $t_2$, is equal to $\tau$ such that
$P(\tau \mid t_1, s_2) > 0.5$ if such a $\tau$ exists; otherwise, that is, if $P(A \mid t_1, s_2) = P(B \mid t_1, s_2) = 0.5$,
then $t_2 = s_2$. This determines the conditional probability $P(h^1, t_2 \mid \tau)$. Herding (and
other results) is calculated according to the relevant formulae given. 
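Steps 1 to 4 for the first history can be sketched in a few lines of Python. This is a minimal illustration written for this section, not the program from the Supplementary Information: it assumes the acceptance rule of the generalized M1 model (with v = 1 giving M1), and all function names are hypothetical. The string "rejected" stands in for the empty-history symbol.

```python
from itertools import product

HYPOTHESES = ("A", "B")
REJECTED = "rejected"  # placeholder for the empty-history symbol


def signal_prob(signal, tau, beta):
    """Step 1: P(s | tau) is beta when the signal matches the hypothesis."""
    return beta if signal == tau else 1.0 - beta


def accept_prob(posterior_for_theme, v=1.0):
    """Generalized M1 rule: 1/2 + v (posterior - 1/2), clipped to [0, 1]."""
    return min(1.0, max(0.0, 0.5 + v * (posterior_for_theme - 0.5)))


def first_history_probs(beta, v=1.0):
    """Steps 2 to 4: P(h^1 | tau) for h^1 in {A, B, rejected}."""
    probs = {tau: {"A": 0.0, "B": 0.0, REJECTED: 0.0} for tau in HYPOTHESES}
    for tau in HYPOTHESES:
        for s1, s2 in product(HYPOTHESES, repeat=2):
            weight = signal_prob(s1, tau, beta) * signal_prob(s2, tau, beta)
            if weight == 0.0:
                continue  # this signal profile cannot occur under tau
            t1 = s1  # step 2: scientist 1 submits on their own signal
            like = {
                x: signal_prob(t1, x, beta) * signal_prob(s2, x, beta)
                for x in HYPOTHESES
            }
            post_t1 = like[t1] / (like["A"] + like["B"])  # step 3
            a = accept_prob(post_t1, v)
            probs[tau][t1] += weight * a
            probs[tau][REJECTED] += weight * (1.0 - a)
    return probs


def expected_misperception(beta, v=1.0):
    """Equation (3) evaluated for n = 1."""
    probs = first_history_probs(beta, v)
    total = 0.0
    for h, p_a in probs["A"].items():
        p_b = probs["B"][h]
        if p_a + p_b > 0.0:
            total += (1.0 - p_a / (p_a + p_b)) * p_a
    return total


print(expected_misperception(0.7))
```

With an uninformative signal (beta = 0.5) the sketch returns 0.5, and with a perfectly informative signal (beta = 1) it returns 0, which brackets the values obtained for intermediate beta.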


Step 6: Steps 3-5 are repeated for $j \in \{3, \ldots, n+1\}$ for every possible information
profile $(h^{j-2}, t_{j-1}, s_j)$ of scientist j with the following modifications: scientist
j’s posterior beliefs are

\[ P(\tau \mid h^{j-2}, t_{j-1}, s_j) = \frac{P(h^{j-2}, t_{j-1}, s_j \mid \tau)}{P(h^{j-2}, t_{j-1}, s_j \mid A) + P(h^{j-2}, t_{j-1}, s_j \mid B)} \]

where $P(h^{j-2}, t_{j-1}, s_j \mid \tau) = P(h^{j-2}, t_{j-1} \mid \tau) P(s_j \mid \tau)$ in step 3; $h^{j-1} \in h^{j-2} \times \{A,B,\emptyset\}$
replaces $h^1$ and $P(h^{j-1} \mid \tau)$ is obtained by combining $P(h^{j-2}, t_{j-1}, s_j \mid \tau)$ and scientist
j’s acceptance probability given their information profile $(h^{j-2}, t_{j-1}, s_j)$ in step 4; and
$P(\tau \mid h^{j-2}, t_{j-1}, s_j)$ and $P(h^{j-1}, t_j \mid \tau)$ replace $P(\tau \mid t_1, s_2)$ and $P(h^1, t_2 \mid \tau)$, respectively, in
step 5.

Analytical results on asymptotic properties. Analytic comparison of different
models is obtained asymptotically as the number of scientists tends to infinity.
Consider M1. Let $H^n = \{A,B,\emptyset\}^n$ denote the set of all possible histories of length
n, and $h^n \in H^n$ denote a history in $H^n$. Then, $F_n = \{\emptyset\} \cup H^1 \cup \cdots \cup H^n$ for n =
1, 2, ..., constitute an infinite sequence of $\sigma$-fields on $H^\infty$.

For each $h^n$, let $P(h^n)$ be the ex ante probability that $h^n$ will realize from either
$\tau = A, B$. Let

\[ X_n(h^n) = \frac{P(h^n \mid A)}{P(h^n \mid A) + P(h^n \mid B)} \]

denote the Bayes-updated posterior belief that $\tau = A$ after $h^n$. Then, $X_n$ is a random
variable defined on $(H^\infty, F_n, P)$, and $\{(X_n, F_n)\}_{n=1,2,\ldots}$ constitutes a martingale.
Let $Q(h^n) = P(h^n \mid A)$. Then, with $X_n$ defined on $(H^\infty, F_n, Q)$, the sequence
$\{(X_n, F_n)\}_{n=1,2,\ldots}$ constitutes a submartingale. By the Martingale Convergence
Theorem, $E(X_n) \to E(X)$ almost surely where X is a random variable such that
$X_n \to X$ with probability 1 and $E(\cdot)$ is taken relative to Q.

Consider a history $h^n$ with the corresponding posterior $X_n = x < 1$. Then, there
are three possible continuation histories of length n + 1: $h^n$ followed by A, B, or $\emptyset$.
As the manuscript of scientist n + 1 is accepted with a probability that is strictly
between 0 and 1, (i) at least two of the three possible continuation histories realize
with a strictly positive probability. Furthermore, (ii) the posteriors after these
continuation histories differ, (iii) they depend on x but not on n, (iv) the distribution
over these posteriors conditional on $\tau = A$ first-order stochastically dominates
that conditional on $\tau = B$. Hence, $E(X_{n+1} \mid X_n = x) - x$ is a strictly positive
constant that depends on x but not on n, and consequently, $E(X) < 1$ is not viable.
As $E(X) \le 1$, therefore, we conclude that $E(X) = 1$, that is, the posterior converges
to the true state with probability 1 when $\tau = A$. As a symmetric argument applies to the
case that $Q(h^n) = P(h^n \mid B)$, that is, when $\tau = B$, the misperception converges to 0 as
$n \to \infty$ under M1.

Next, consider the generalized M1 model with $v > 0$. As long as $0 < v \le 1$, it is
straightforward to verify that the deductions (i)-(iv) hold and, consequently, the
same argument as above leads to the same conclusion that the misperception
converges to 0 as $n \to \infty$. If $v > 1$, on the other hand, any manuscript on theme
t will be accepted with certainty once the posterior belief for the theme being true
exceeds a certain threshold level which is strictly below 1. In addition, the scientists
will submit on the popular theme regardless of their own signal if the posterior for
that theme exceeds a (different) threshold. Therefore, if the posterior belief for
$\tau = A$ gets sufficiently close to 1 or 0, both the author’s theme selection and the
reviewer’s decision are uniquely determined by the prevailing posterior independently
of the scientist’s own signal. Once this stage is reached, then the continuation
history is uniquely determined (irrespective of whether $\tau = A$ or B) unlike (i)
above and, consequently, publication outcomes reveal no further information and
the posterior remains at the same level forever. Therefore, the expected misperception
never converges to 0 and remains fixed at some positive level within finite time
with probability 1.

For M2 and M3, by the same token the expected misperception never converges
to 0 and gets stuck at some positive level once the posterior belief reaches a level
such that the author’s theme selection is dictated by herding independently of their
own signal.


Sequence variants in SLC16A11 are a common risk
factor for type 2 diabetes in Mexico


The SIGMA Type 2 Diabetes Consortium* 


Performing genetic studies in multiple human populations can identify 
disease risk alleles that are common in one population but rare in 
others, with the potential to illuminate pathophysiology, health disparities,
and the population genetic origins of disease alleles. Here
we analysed 9.2 million single nucleotide polymorphisms (SNPs) in
each of 8,214 Mexicans and other Latin Americans: 3,848 with type 2
diabetes and 4,366 non-diabetic controls. In addition to replicating
previous findings, we identified a novel locus associated with type
2 diabetes at genome-wide significance spanning the solute carriers
SLC16A11 and SLC16A13 (P = 3.9 × 10^−13; odds ratio (OR) = 1.29).
The association was stronger in younger, leaner people with type 2
diabetes, and replicated in independent samples (P = 1.1 × 10^−4;
OR = 1.20). The risk haplotype carries four amino acid substitutions,
all in SLC16A11; it is present at ~50% frequency in Native American
samples and ~10% in east Asian, but is rare in European and African 
samples. Analysis of an archaic genome sequence indicated that the 
risk haplotype introgressed into modern humans via admixture with 
Neanderthals. The SLC16A11 messenger RNA is expressed in liver, 
and V5-tagged SLC16A11 protein localizes to the endoplasmic reticu- 
lum. Expression of SLC16A11 in heterologous cells alters lipid meta- 
bolism, most notably causing an increase in intracellular triacylglycerol 
levels. Despite type 2 diabetes having been well studied by genome- 
wide association studies in other populations, analysis in Mexican 
and Latin American individuals identified SLC16A11 as a novel can- 
didate gene for type 2 diabetes with a possible role in triacylglycerol 
metabolism. 

The Slim Initiative in Genomic Medicine for the Americas (SIGMA) 
Type 2 Diabetes Consortium set out to characterize the genetic basis of 
type 2 diabetes in Mexican and other Latin American populations, where 
the prevalence is roughly twice that of US non-Hispanic whites (see
also http://www.cdc.gov/diabetes/pubs/factsheet11.htm). This report 
considers 3,848 type 2 diabetes cases and 4,366 controls (Table 1) gen- 
otyped using the Illumina OMNI 2.5 array that were unrelated to other 
samples, and that fall on a cline of Native American and European 
ancestry (Extended Data Fig. 1). Association analysis included 9.2 million
variants that were imputed from the 1000 Genomes Project Phase I
release based on 1.38 million SNPs directly genotyped at high quality
with minor allele frequency (MAF) >1%. 


The association of SNP genotype with type 2 diabetes was evaluated 
using LTSOFT, a method that increases power by jointly modelling
case-control status with non-genetic risk factors. Our analysis used body
mass index (BMI) and age to construct liability scores and also included
adjustment for sex and ancestry via principal components. The
quantile-quantile (QQ) plot is well calibrated under the null (λGC = 1.05;
Fig. 1a, red), indicating adequate control for confounders, with sub- 
stantial excess signal at P< 10 *. 

We first examined SNPs previously reported to be associated to risk of 
type 2 diabetes. Two such variants reached genome-wide significance: 
TCF7L2 (rs7903146; P = 2.5 X 1071”; OR = 1.41 (95% confidence inter- 
val 1.30-1.53)) and KCNQ]1 (rs2237897; P = 4.9 X 10° '°; OR= 0.74 
(0.69-0.80)) (Extended Data Figs 2, 3a), with effect sizes and frequencies
consistent with previous studies. At KCNQ1, we identified a
signal of association that shows limited linkage disequilibrium both to
rs2237897 (r² = 0.056) and to rs231362 (r² = 0.028) (previously seen
in Europeans), suggesting a third allele at this locus (rs139647931;
after conditioning, P = 5.3 X 10 °; OR=0.78 (0.70-0.86); Extended 
Data Fig. 3b and Supplementary Note). 

More generally, of SNPs previously associated with type 2 diabetes 
at genome-wide significance, 56 of 68 are directionally consistent with 
the initial report (P = 3.1 X 107°; Supplementary Table 1). Nonetheless, 
a QQ plot excluding all SNPs within 1 megabase (Mb) of the 68 type 2 
diabetes associations remains strikingly non-null (Fig. 1a, blue). 

This excess signal of association is entirely attributable to two regions 
of the genome: chromosome 11p15.5 and 17p13.1 (Fig. 1a, black). The 
genome-wide significant association at 11p15.5 spans insulin, IGF2 
and other genes (Extended Data Fig. 3a): the SNP with the strongest 
association lies in the 3’ untranslated region (UTR) of IGF2 and the 
non-coding INS-IGF2 transcript (rs11564732, P = 2.6 X 10 OR=0.77 
(0.70-0.84); Supplementary Table 2). The associated SNPs are ~700 kilobases
(kb) from the genome-wide significant signal in KCNQ1 (above),
and analysis conditional on the two significant KCNQ1 SNPs reduced 
the INS-IGF2 association signal to just below genome-wide significance 
(P=7.5 X 10 ’, Extended Data Fig. 3c). Conditioning on the two KCNQ1 
SNPs and the INS-IGF2 SNP reduces the signal to background (Extended 
Data Fig. 3d). Further analysis is needed to determine whether the INS-IGF2
signal is reproducible and independent of that at KCNQ1.


Table 1 | Study cohorts comprising the SIGMA type 2 diabetes project data set

| Study | Sample location | Study design | Group | n (before quality control) | Per cent male | Age (years) | Age-of-onset (years) | BMI (kg m^−2) | Fasting plasma glucose (mmol l^−1) |
|---|---|---|---|---|---|---|---|---|---|
| UNAM/INCMNSZ Diabetes Study (UIDS) | Mexico City, Mexico | Prospective cohort | Controls | 1,138 (1,195) | 41.1 | 55.3 ± 9.4 | - | 28.1 ± 4.0 | 4.8 ± 0.5 |
| | | | T2D cases | 815 (872) | 40.9 | 56.2 ± 12.3 | 44.2 ± 11.3 | 28.4 ± 4.5 | - |
| Diabetes in Mexico Study (DMS) | Mexico City, Mexico | Prospective cohort | Controls | 472 (505) | 25.8 | 62.5 ± 7.7 | - | 28.0 ± 4.4 | 5.0 ± 0.4 |
| | | | T2D cases | 690 (762) | 33.0 | 55.8 ± 11.1 | 47.8 ± 10.6 | 29.0 ± 5.4 | - |
| Mexico City Diabetes Study (MCDS) | Mexico City, Mexico | Prospective cohort | Controls | 613 (790) | 39.3 | 62.5 ± 7.7 | - | 29.4 ± 4.8 | 5.0 ± 0.5 |
| | | | T2D cases | 287 (358) | 41.1 | 64.2 ± 7.5 | 55.1 ± 9.7 | 29.9 ± 5.4 | - |
| Multiethnic Cohort (MEC) | Los Angeles, California, USA | Case-control | Controls | 2,143 (2,464) | 48.3 | 59.3 ± 7.0 | - | 26.6 ± 3.9 | N/A |
| | | | T2D cases | 2,056 (2,279) | 47.9 | 59.2 ± 6.9 | N/A | 30.0 ± 5.4 | - |

The table shows sample location, study design, numbers of cases and controls (including numbers before quality control checks), per cent male participants, age ± standard deviation (s.d.), age-of-onset in
cases ± s.d., body mass index ± s.d., and fasting plasma glucose in controls ± s.d. N/A, not applicable; T2D, type 2 diabetes.


*Lists of participants and their affiliations appear at the end of the paper.

Figure 1 | Identification of a novel type 2 diabetes risk haplotype carrying
5 SNPs in SLC16A11. a, QQ plot of association statistics in genome-wide scan
of n = 8,214 samples shows calibration under the null and enrichment in the 
tail for all SNPs (red), and after removing SNPs within 1 Mb of previously 
published type 2 diabetes associations (blue). Removal of sites within 1 Mb of 68 
known loci and two novel loci results in a null distribution (black). Association 
with liability threshold quantitative traits tested via linear regression. T2D, 
type 2 diabetes. b, Regional plot of association at 17p13.1 that spans SLC16A11 
and SLC16A13. c, Analysis conditional on genotype at rs13342232 (the top 
associated variant) reduces signal to far below genome-wide significance 
across the surrounding region. Colour indicates r² to the most strongly
associated site; recombination rate is shown, each based on the 1000 Genomes 
ASN population. d, Graphical depictions of SLC16A11 haplotypes constructed 
from the synonymous and four missense SNPs associated to type 2 diabetes, 
with haplotype frequencies derived from the 1000 Genomes Project and 
SIGMA samples. AFR, African (n = 185); ASN, east Asian (n = 286); EUR, 
European (n = 379); MXL, Mexican samples from Los Angeles (n = 66). 


The strongest novel association is at 17p13.1 spanning SLC16A11
and SLC16A13 (Fig. 1b), both poorly characterized members of the
monocarboxylic acid transporter family of solute carriers. The strongest
signal of association includes a silent mutation as well as four missense
SNPs, all in SLC16A11 (Fig. 1d, e). These five variants are (1) in strong
linkage disequilibrium (r² = 0.85 in 1000 Genomes samples from the
Americas) and co-segregate on a single haplotype; (2) common in samples 
of Latin American ancestry; and (3) show equivalent levels of asso- 
ciation to type 2 diabetes (P = 2.4 x 10 *toP =3.9X 10 °;OR= 1.29 
(1.20-1.38); Supplementary Tables 3-5). Analysis conditional on any 
of these variants leaves no genome-wide significant signal (Fig. 1c and 
Extended Data Fig. 4). Computational prediction with SIFT (which

Frequencies from SIGMA samples are calculated from genotypes and
represent either the entire data set (All) or only samples estimated to
have ≥95% Native American ancestry (≥95 NA, n = 290; Supplementary
Methods). Haplotypes with population frequency <1% are not depicted.

e, Predicted membrane topology of human SLC16A11 generated using 
TMHMM 2.0 and visualized with TeXtopo. Locations of SNPs carried by the 
type-2-diabetes-associated haplotype are indicated. f, Forest plot depicting 
odds ratio estimates at rs75493593 from the four SIGMA cohorts, the SIGMA 
pooled mega-analysis, the replication cohorts, replication-only meta-analysis 
based on inverse standard error weighting of effect sizes, and the overall 
meta-analysis (including all replication cohorts and the SIGMA mega- 
analysis). Accompanying table lists ethnicity, cohort names, estimated odds 
ratio (OR) and 95% confidence interval (95% CI). Replication cohorts are 
the Type 2 Diabetes Genetic Exploration by Next-generation sequencing in 
multi-Ethnic Samples (T2D-GENES), Multiethnic Cohort (MEC), and 
Singapore Chinese Health Study (SCHS). Further details including sample 
sizes are provided in Supplementary Table 8. 


considers each site independently) labels one of the missense SNPs 
(rs13342692, D127G) as damaging and the other three ‘tolerated’ (Sup- 
plementary Table 6). 

Individuals that carry the risk haplotype develop type 2 diabetes 
2.1 years earlier (P = 3.1 X 10 *), and at 0.9 kg m ~* lower BMI (P= 
5.2 X 10 “) than non-carriers (Extended Data Fig. 5). The odds ratio 
for the risk haplotype estimated using young cases (≤45 years) was higher
than in older cases (OR = 1.48 versus 1.11; Pheterogeneity = 1.7 X 10°). 
We tested the haplotype for association with related metabolic quant- 
itative traits in the fasting state in a subset of SIGMA participants (n = 
1,505-3,855). No associations surpass nominal significance (P < 0.05; 
Supplementary Table 7).

Given that large genome-wide association studies (GWAS) have
been performed for type 2 diabetes in samples of European and Asian
ancestry, it may seem surprising that associated variants at SLC16A11/13
were not previously identified. Using data generated by the 1000 Genomes 
Project and the current study, we observed that the risk haplotype 
(hereafter referred to as ‘5 SNP’ haplotype) is rare or absent in samples 
from Europe and Africa, has intermediate frequency (~10%) in sam- 
ples from east Asia, and up to ~50% frequency in samples from the 
Americas (Fig. 1d and Extended Data Fig. 6a). A second haplotype 
carrying one of the four missense SNPs (D127G) and the synonymous 
variant (termed the ‘2 SNP’ haplotype) is very common in samples 
from Africa but rare elsewhere, including in the Americas (Fig. 1d). 
The low frequency of the 5 SNP haplotype in Africa and Europe may 
explain why this association was not found in previous studies. 

We attempted to replicate this association in ~22,000 samples from a
variety of ancestry groups. A proxy for the 5 SNP haplotype of SLC16A11 
showed strong association with type 2 diabetes (Preptication = 1.1 X 104; 
OReplication = 1.20 (1.09-1.31); Poombined = 5-4 X 107°; ORcombined = 1.25 
(1.18-1.32); Fig. 1f and Supplementary Table 8). The association was 
clearly observed in east Asian samples, a population that lacks admix- 
ture of Native American and European populations and shows little 
genetic substructure. This result argues against population stratifica- 
tion as an explanation for the finding in Latin American populations. 

We estimated the difference in disease prevalence attributable to a risk
factor with OR = 1.20 (1.09-1.31), 26% frequency in Mexican Americans 
(as in the SIGMA control samples) and 2% in European Americans. 
Approximately 20% (9.2-29%) of the difference in prevalence could be 
explained by such a risk factor (Supplementary Methods). 
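The arithmetic behind this kind of estimate can be sketched as follows. This is an illustration only, not the calculation in the Supplementary Methods: it assumes Hardy-Weinberg proportions, treats the odds ratio as a per-allele relative risk in a log-additive model, and uses hypothetical prevalences of 20% and 10% to stand for "roughly twice". The function names are likewise hypothetical.

```python
def mean_relative_risk(freq, rr):
    """Population mean of rr**G for G ~ Binomial(2, freq) risk-allele copies.

    Assumes Hardy-Weinberg proportions and a log-additive (multiplicative)
    per-allele effect.
    """
    return (1.0 - freq + freq * rr) ** 2


def fraction_of_gap_explained(prev_hi, prev_lo, freq_hi, freq_lo, rr):
    """Share of the prevalence difference that vanishes if the locus is removed."""
    hi_without = prev_hi / mean_relative_risk(freq_hi, rr)
    lo_without = prev_lo / mean_relative_risk(freq_lo, rr)
    gap_before = prev_hi - prev_lo
    gap_after = hi_without - lo_without
    return 1.0 - gap_after / gap_before


# Frequencies and relative risk from the text; prevalences are hypothetical.
share = fraction_of_gap_explained(
    prev_hi=0.20, prev_lo=0.10, freq_hi=0.26, freq_lo=0.02, rr=1.20
)
print(round(share, 3))
```

The result depends on the prevalences supplied, so the sketch is meant to show the structure of the calculation rather than to reproduce the reported figure.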

Two population genetic features of the 5 SNP haplotype struck us as 
discordant. The haplotype sequence is highly divergent, with an estimated 
time to most recent common ancestor (TMRCA) of 799,000 years to a 
European haplotype (Supplementary Table 9 and Supplementary Note). 
This long precedes the ‘out of Africa’ bottleneck. And yet, the haplo- 
type is not observed in Africa and is rare throughout Europe (Fig. 1d). 

This combination of age and geographical distribution could be con- 
sistent with admixture from Neanderthals into modern humans. Neither 
the published Neanderthal genome nor the Denisova genome contained
the variants observed on the 5 SNP haplotype. However, an unpublished
genome of a Neanderthal from Denisova Cave is homozygous
across 5 kb for the 5 SNP haplotype at SLC16A11, including all four

Figure 2 | SLC16A11 localizes to the endoplasmic reticulum and alters lipid
metabolism in HeLa cells. a, Localization of SLC16A11 to the endoplasmic 
reticulum. HeLa cells expressing C terminus, V5-tagged SLC16A11 were 
immunostained for SLC16 expression (anti-V5) along with markers for the 
endoplasmic reticulum (anti-calnexin), cis-Golgi apparatus (anti-Golph4), or 
mitochondria (MitoTracker). Imaging of each protein was optimized for clarity 
of localization rather than comparison of expression level across proteins. 
Representative images from multiple independent transfections are shown. 

b, Changes in intracellular lipid metabolites after expression of SLC16A11-V5 
in HeLa cells. The fold change in cells expressing SLC16A11 relative to cells 
expressing control proteins is plotted for individual lipid metabolites, with lipid

missense SNPs. Over a span of 73 kb this Neanderthal sequence is nearly
identical to that of individuals from the 1000 Genomes Project who are 
homozygous for the 5 SNP haplotype (Supplementary Note). 

Two lines of evidence indicate that the 5 SNP haplotype entered 
modern humans through archaic admixture. First, the Neanderthal 
sequence is more closely related to the extended 73 kb 5 SNP haplotype 
than to random non-risk haplotypes (mean TMRCA = 250,000 years 
versus 677,000 years; Supplementary Tables 10 and 11 and Supplemen- 
tary Note), forming a clade with the risk haplotype (Extended Data 
Fig. 6b) with a coalescence time that post-dates the range of estimated 
split times between modern humans and Neanderthals. Second, the
genetic length of the 73-kb haplotype is longer than would be expected 
if it had undergone recombination for ~9,000 generations since the 
split with Neanderthals (P = 3.9 X 10°; Supplementary Note). These 
two features indicate that the 5 SNP haplotype is not only similar to the 
Neanderthal sequence, but was probably introduced into modern 
humans relatively recently through archaic admixture. We note that 
whereas this particular Neanderthal-derived haplotype is common in 
the Americas, Latin Americans have the same proportion of Neanderthal 
ancestry genome-wide as other Eurasian populations (~2%).

With an absence of multiple independently segregating functional 
mutations in the same gene, we lack formal genetic proof that SLC16A11 
is the gene responsible for association to type 2 diabetes at 17p13.1. 
Nonetheless, as the associated haplotype encodes four missense SNPs 
in a single gene (Supplementary Table 12), we set out to begin char- 
acterizing the function of SLC16A11. 

We examined the tissue distribution of SLC16A11 mRNA expression 
using Nanostring and ~55,000 curated microarray samples. In both 
data sets, we observed SLC16A11 expression in liver, salivary gland and 
thyroid (Extended Data Figs 7 and 8). We used immunofluorescence to 
determine the subcellular localization of V5-tagged SLC16A11 intro- 
duced into HeLa cells. SLC16A11-V5 co-localizes with the endoplas- 
mic reticulum membrane protein calnexin, but shows minimal overlap 
with plasma membrane, Golgi apparatus and mitochondria (Fig. 2a). 
Distinct patterns were seen for other SLC16 family members, which are 
known to have diverse cellular functions: SLC16A13-V5 localizes to the
Golgi apparatus and SLC16A1-V5 appears at the plasma membrane
(Extended Data Fig. 9 and data not shown). 

As SLC16 family members are solute carriers, we expressed SLC16A11 
(or control proteins) in HeLa cells (which do not express SLC16A11 at

classes indicated by point colour and P values (of the Wilcoxon rank-sum test)
by point size. c, Fold change plotted for both polar and lipid metabolites, 
grouped according to metabolic pathway or class. Pathways shown include all 
KEGG pathways from the human reference set for which metabolites were 
measured as well as eight additional classes of metabolites covering carnitines 
and lipid subtypes. Each point within a pathway or class shows the fold change 
of a single metabolite within that pathway or class. Pathway names and 
statistical analyses are shown in Extended Data Fig. 10 and Supplementary 
Table 14. Metabolite data shown are the combined results from three 
independent experiments, each of which included 12 biological replicates each 
for SLC16A11 and control.

appreciable levels) and profiled ~300 polar and lipid metabolites. Expression
of SLC16A11 resulted in substantial increases in triacylglycerol
(TAG) levels (P = 7.6 X 10° |”), with smaller increases in intracellular 
diacylglycerols (P = 7.8 X 10 *) and decreases in lysophosphatidyl- 
choline (P = 2.0 X 107°), cholesterol ester (P = 9.8 X 10° *) and sphin- 
gomyelin (P = 3.9 X 10. *) lipids (Fig. 2b, cand Supplementary Tables 
13 and 14). As TAG synthesis takes place in the endoplasmic reticulum 
in the liver, these results indicate that SLC16A11 may have a role in
hepatic lipid metabolism. We note that serum levels of specific TAGs
have been prospectively associated with future risk of type 2 diabetes
and accumulation of intracellular lipids has been implicated in insulin
resistance in human populations.

In summary, GWAS in Mexican and other Latin American samples 
identified a haplotype containing four missense SNPs, all in SLC16A11, 
that is much more common in individuals with Native American ances- 
try than in other populations. Each haplotype copy is associated with 
a ~20% increased risk of type 2 diabetes. With these properties, the 
haplotype would be expected to contribute to the higher burden of 
type 2 diabetes in Mexican and Latin American populations. The
haplotype derives from Neanderthal introgression, providing an example
of Neanderthal admixture affecting physiology and disease susceptibility
today. Our data suggest the hypothesis for future studies that SLC16A11
may influence diabetes risk through effects on lipid metabolism in the
liver. Our results also indicate that genetic mapping in understudied
populations can identify previously undiscovered aspects of disease
pathophysiology.

Note added in proof: While this paper was in final revision, Hara 
et al. reported a SNP in SLC16A13 (rs312457) as associated with risk
of T2D in an east Asian population with OR = 1.20, P= 107 '”. 


METHODS SUMMARY 


DNA samples were prepared using strict quality control procedures and genotyped 
using the Illumina HumanOmni2.5 array. Stringent sample and SNP quality (including
ancestry) filters were applied on the resulting genotypes. After imputation,
SNPs were quality filtered (MAF ≥1% and info score ≥0.6) and association testing
was performed via LTSOFT with type 2 diabetes status, BMI, and age modelling
liability and adjusting for sex and top two principal components as fixed effect
covariates. P values were corrected for genomic control (λGC = 1.046). Odds ratios
(ORs) are from logistic regression in PLINK using BMI, age, sex, and top 2 principal
components as covariates. Proportion of Native American ancestry was
estimated using ADMIXTURE (K = 3) run including unadmixed individuals from
several populations. 
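For readers unfamiliar with genomic control, the correction mentioned above can be sketched as follows. This is the standard textbook form (inflation factor equal to the median test statistic divided by the median of a 1-d.f. chi-square distribution), written with the Python standard library only; it is not the pipeline used in this study, and the function names and the three input statistics are hypothetical.

```python
import math
from statistics import median

# Median of a chi-square distribution with 1 d.f., i.e. the square of the
# 75th percentile of the standard normal distribution.
NORMAL_Q75 = 0.6744897501960817
CHI2_1DF_MEDIAN = NORMAL_Q75 ** 2


def lambda_gc(chi2_stats):
    """Genomic inflation factor: observed median over the null median."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def p_from_chi2_1df(x):
    """Upper-tail P value of a 1-d.f. chi-square statistic."""
    return math.erfc(math.sqrt(x / 2.0))


def gc_corrected_p(chi2_stats):
    """Divide every statistic by lambda_GC before converting to P values."""
    lam = lambda_gc(chi2_stats)
    return [p_from_chi2_1df(x / lam) for x in chi2_stats]


# Hypothetical statistics whose median is inflated by a factor of 1.046.
stats = [0.1, CHI2_1DF_MEDIAN * 1.046, 5.0]
print(lambda_gc(stats), gc_corrected_p(stats))
```

Dividing by an inflation factor above 1 shrinks every statistic, so the corrected P values are uniformly larger than the uncorrected ones.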

Odds ratios for young (≤45 years) and older age of onset cases were calculated
using logistic regression in each group compared to two randomly selected non- 
overlapping sets of controls. Significance testing used a Z-score calculated from 
these odds ratios. 
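One common way to build such a Z-score is to compare the two log odds ratios against their combined standard error, as in the sketch below. The source does not spell out the formula, so this form is an assumption; it treats the two estimates as independent, which is plausible given the non-overlapping control sets. The standard errors in the usage example are hypothetical.

```python
import math


def or_heterogeneity_z(or_1, se_log_or_1, or_2, se_log_or_2):
    """Z-score and two-sided P value for a difference between two odds ratios.

    Works on the log scale and assumes the two estimates are independent.
    """
    diff = math.log(or_1) - math.log(or_2)
    z = diff / math.sqrt(se_log_or_1 ** 2 + se_log_or_2 ** 2)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# Odds ratios from the text; the standard errors are hypothetical.
z, p = or_heterogeneity_z(1.48, 0.07, 1.11, 0.05)
print(z, p)
```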

Population prevalence was modelled using odds ratio to approximate relative 
risk in a log-additive effect model. Relative change in population prevalences is
reported based on removing a locus with relative risk of 1.20 and the indicated 
frequency. 

Gene expression analyses were performed on data collected using Nanostring 
and a compendium of publicly available Affymetrix U133 Plus 2.0 microarrays. 
The subcellular localization of SLC16A11-V5 and metabolic profiling studies were 
performed after expression of carboxy-terminus, V5-tagged SLC16A11 in HeLa 
cells. Metabolite values were normalized to the total metabolite signal obtained for 
each sample. Measurements were obtained in replicate from each of three inde- 
pendent experiments, with data combined after subtracting the mean of the log- 
transformed values. The Wilcoxon rank sum test was used to test for differences in 
individual metabolite levels in cells expressing SLC16A11 compared to controls; 
the Wilcoxon signed rank test was used to assess differences in lipid classes. 


Online Content Any additional Methods, Extended Data display items and Source
Data are available in the online version of the paper; references unique to these
sections appear only in the online paper.

Many secretory proteins are targeted by signal sequences to a protein-
conducting channel, formed by prokaryotic SecY or eukaryotic Sec61
complexes, and are translocated across the membrane during their
synthesis. Crystal structures of the inactive channel show that the
SecY subunit of the heterotrimeric complex consists of two halves
that form an hourglass-shaped pore with a constriction in the middle
of the membrane and a lateral gate that faces the lipid phase. The
closed channel has an empty cytoplasmic funnel and an extracellular 
funnel that is filled with a small helical domain, called the plug. During 
initiation of translocation, a ribosome-nascent chain complex binds 
to the SecY (or Sec61) complex, resulting in insertion of the nascent 
chain. However, the mechanism of channel opening during trans- 
location is unclear. Here we have addressed this question by deter- 
mining structures of inactive and active ribosome-channel complexes 
with cryo-electron microscopy. Non-translating ribosome-SecY 
channel complexes derived from Methanocaldococcus jannaschii 
or Escherichia coli show the channel in its closed state, and indicate 
that ribosome binding per se causes only minor changes. The struc- 
ture of an active E. coli ribosome-channel complex demonstrates 
that the nascent chain opens the channel, causing mostly rigid body 
movements of the amino- and carboxy-terminal halves of SecY. In 
this early translocation intermediate, the polypeptide inserts as a 
loop into the SecY channel with the hydrophobic signal sequence 
intercalated into the open lateral gate. The nascent chain also forms 
a loop on the cytoplasmic surface of SecY rather than entering the 
channel directly. 

Opening of the SecY channel during initiation of translocation involves 
two events: binding of the ribosome and insertion of the nascent chain. 
To analyse how ribosome binding per se affects the structure of a translocation
channel, we first determined the structure of complexes lacking a nascent chain.
Initial experiments were performed with complexes from M. jannaschii, because
this allows a direct comparison with a crystal structure of SecY. Purified
M. jannaschii ribosomes were incubated with an excess of SecY complex, and
complexes were imaged by cryo-electron microscopy. A total of ~37,000 particles
were analysed, resulting in an electron density map with a resolution of 9.0 Å
for the ribosome and ~12.7 Å for the channel (Supplementary Table 1).

A ribosome model from Pyrococcus furiosus, a species related to
M. jannaschii, was fit into the density map, allowing the identification
of essentially all RNA helices and many helical features of ribosomal
proteins (Fig. 1a and Supplementary Fig. 1). A crystal structure of the
M. jannaschii SecY complex could be docked into density for the SecY
channel (Fig. 1b and Supplementary Fig. 2), and molecular dynamics
flexible fitting (MDFF) resulted in only small changes (Fig. 1c). All
transmembrane segments (TMs), including the 10 TMs of SecY, and
the single TMs of the SecE and Secβ subunits, could be accounted for in
the map. Several TM helices and the extracellular loop between TMs 5 
and 6 were partially resolved (Supplementary Fig. 3). A comparison with 
the crystal structure shows that, with the exception of some adjustments
in the cytoplasmic helix of SecE, membrane-embedded domains remained
essentially unaltered (Fig. 1c). As observed previously with other species,
loops between TMs 6 and 7 (6/7 loop) and TMs 8 and 9 (8/9 loop) of
SecY, as well as the cytoplasmic helix of SecE (Fig. 1b), all interact with
components of the large ribosomal subunit at the tunnel exit (Supplementary
Fig. 4a–c). These interactions do not induce major structural
changes in the SecY channel and leave the lateral gate closed.

Next we determined the structure of a non-translating ribosome-channel
complex from E. coli, with a larger data set than used previously.
A total of ~39,000 particles were analysed, resulting in a density map
with a resolution of ~9.5 Å for the ribosome and ~14 Å for the channel
(Supplementary Table 1). Models for ribosomal subunits were docked
into the density map (Fig. 1d) and all RNA helices were visible, as well 
as some partially resolved helices of ribosomal proteins (Supplemen- 
tary Fig. 5). Because there is no crystal structure of the E. coli SecY com- 
plex, we generated a homology model on the basis of crystal structures 
of Thermus thermophilus and Thermotoga maritima complexes (Supplementary
Figs 6 and 7). This model was subjected to MDFF using the
entire density map of the ribosomal large subunit and channel as a 
restraint. This resulted in movements of cytoplasmic loops, whereas 
membrane-embedded domains remained essentially unchanged (Supplementary
Fig. 8). Many features of the channel are clearly visible in a
segmented map (Fig. 1e and Supplementary Figs 9 and 10), including
cytoplasmic loops of SecY, two helices of SecE, two TMs of SecG (the
bacterial equivalent of archaeal Secβ) and some partially resolved TMs
of SecY. Connections between the channel and ribosome were similar 
to those in the M. jannaschii complex, with the exception of the longer 
6/7 loop of SecY, which is repositioned between RNA helices 6 and 7 
(Supplementary Fig. 4d-f). Importantly, the ribosome alone does not 
induce major changes in the channel structure, so the lateral gate remains 
closed (Fig. 1f). 

To determine the structure of an active E. coli ribosome-channel 
complex, we used a new strategy. Previous attempts to obtain a structure 
of an active translocation channel showed that a translating ribosome 
was bound to the channel, but there was little biochemical evidence that 
a nascent chain was inserted in the channel and no clear electron density 
was visible for the polypeptide. These studies used small amounts of
ribosome-nascent chain complexes (RNCs) that were formed in vitro 
and subsequently added to purified channels. To obtain a more physio- 
logical sample, we generated an early translocation intermediate of a 
secretory protein in living E. coli cells by expressing a polypeptide with 
100 amino acids from an inducible promoter. The polypeptide has
an N-terminal signal sequence derived from DsbA, which targets the
protein to the co-translational translocation pathway, and a C-terminal
SecM-stalling sequence, which arrests translation of the ribosome
(Fig. 2a). We also expressed the endoribonuclease MazF from an inducible
promoter to cleave messenger RNA between ribosomes, which
results in the depletion of nascent chains associated with non-stalled 
ribosomes. To generate a stable complex between the SecM-stalled
RNC and the channel, we used disulphide crosslinking. The nascent
chain contained a cysteine at position 19 of the signal sequence, which
can be crosslinked to a cysteine at position 68 in the SecY plug.
Disulphide bond formation was achieved by adding an oxidant to the E. coli
culture, resulting in 70% of nascent chains being linked to SecY.

To purify the RNC-channel complex, we replaced the endogenous
ribosomal protein L12 with a Strep-tagged version, allowing the enrichment
of ribosomes on a Strep-Tactin column. This purification step
was performed at high salt concentration to remove SecY complexes
lacking a nascent chain (Supplementary Fig. 11a). A second purification
step exploited a His-tag inserted into a fusion between SecE and
SecG, and allowed the enrichment of channel-containing complexes
by Co²⁺-affinity chromatography. Finally, the sample was subjected to
gel filtration. The purified RNC-channel complex eluted as a homogeneous
peak at the position of monosomes (Supplementary Fig. 11b).
On a Coomassie-stained SDS gel, the SecY–nascent chain–transfer RNA
species was the only major band besides those from ribosomal proteins
(Fig. 2b, lane 1). As expected, the band disappeared when the sample
was treated with a reducing agent to remove the disulphide bridge or
with RNase A to degrade the tRNA (Fig. 2b, lanes 2 and 3). We found
that the previous protocol of adding purified RNCs to SecY complex,
either in detergent or in nanodiscs, resulted in inefficient insertion
of the nascent chain into the channel (Supplementary Fig. 12). Also,
when RNC-channel complexes were generated in vivo and crosslinked
after purification, crosslinks between different nascent chain molecules
and between the nascent chain and unidentified proteins were observed
(Supplementary Fig. 13). Hence, crosslinking in vivo is required to
maintain the nascent chain in the channel.

Figure 1 | Structures of non-translating ribosome-channel complexes.
a, Density map for the M. jannaschii complex. Models for ribosomal RNA and
proteins of the small and large ribosomal subunits (ssu and lsu; in gold and blue,
respectively) and of the SecY complex (in red) were docked into the map.
b, Fit of the M. jannaschii SecY complex into the segmented density map, as
viewed from the cytoplasm (top view) and from the side. The N- and
C-terminal halves of SecY are in light blue and red, respectively. SecE is in dark
blue and Secβ in brown. c, Comparison between the crystal structure of an
M. jannaschii SecY complex (grey) and the electron microscopy structure
(in colour), as viewed facing the lateral gate (front view). d, e, As in a and b, but
for the E. coli complex. SecG, the bacterial equivalent of Secβ, is in brown.
f, A model for the E. coli channel in a front view.

Purified RNC-channel complexes were frozen over holes on electron
microscopy grids, as the channel was lost when complexes were placed
on a carbon film. A total of ~167,000 individual particles were used,
of which ~50% contained the channel. Additional sorting for the best
signal-to-noise ratio identified ~53,000 particles for structure
determination and resulted in a density map at ~10 Å resolution for the
ribosome and ~11 Å for the channel (Fig. 3a and Supplementary Table 1).
Ribosomal RNAs and proteins were clearly visible in the density map 
(Supplementary Fig. 14), along with aminoacyl (A-site) and peptidyl 
(P-site) tRNAs, as expected for a SecM-stalled ribosome (Supplementary
Fig. 15a). Moreover, there was density for mRNA underneath
the anticodon regions of tRNAs (Supplementary Fig. 15b). We
also observed density for ribosomal protein S1 that was more extensive
than seen before (Fig. 3a and Supplementary Fig. 15c–e).

Figure 2 | Purification of a RNC-channel complex. a, The complex was
generated in living E. coli cells by expressing a nascent chain (NC) of 100 amino
acids with a signal sequence and SecM-stalling sequence. The nascent chain
also contains a Myc-tag. A cysteine at position 19 of the nascent chain (19C)
was disulphide-crosslinked to a cysteine in the plug of SecY (68C).
b, Coomassie-stained SDS-gel of the ribosome-NC (RNC)-channel complex
(lane 1). The red arrow indicates the crosslinked product of SecY and the
NC-tRNA adduct. This band disappears after treatment with
β-mercaptoethanol (β-ME) or RNase A (lanes 2 and 3). Ribosomal proteins
(including S1) and the fusion between SecE and SecG are indicated.

To generate a model for the active channel, we created an E. coli 
homology model on the basis of a crystal structure of the SecY complex 
from P. furiosus, which has the most open lateral gate among known
crystal structures (Supplementary Fig. 16), and used MDFF to adjust 
the model to the experimental density map. The 6/7 loop and TM9 of 
SecY were well resolved (Fig. 3b), and ribosomal components interacting 
with the channel were the same as with the non-translating complex. 
The cytoplasmic helix of SecE and TM10 of SecY were clearly visible, 
and there was good density for SecG (Supplementary Figs 17 and 18). 
In addition, many TMs were partially resolved, with only occasional 
density breaks in the helices. Density for the nascent chain was clearly 
identifiable without segmentation of the density map. Specifically, addi- 
tional density for a helix was visible in the cytoplasmic part of the lateral 
gate (see below), explaining why a channel with a fully open lateral gate 
could be fit into the density map. In fact, the lateral gate is more open 
than in the P. furiosus crystal structure (Supplementary Table 2).
Calculated cross correlation coefficients showed that the model for 
the open SecY channel is a better fit in the density map than the model 
for the closed channel (Supplementary Table 1). 
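
To make the notion of a cross correlation coefficient concrete, the sketch below computes a mean-subtracted (Pearson-type) correlation between two density maps given as flat lists of voxel values sampled on the same grid. It is an illustration only: the exact definition used by the fitting software may differ (for example, correlation about zero rather than about the mean), and the function name is hypothetical.

```python
import math


def cross_correlation(map_a, map_b):
    """Mean-subtracted correlation between two equally sampled density maps."""
    if len(map_a) != len(map_b) or not map_a:
        raise ValueError("maps must be non-empty and of equal size")
    mean_a = sum(map_a) / len(map_a)
    mean_b = sum(map_b) / len(map_b)
    dev_a = [a - mean_a for a in map_a]
    dev_b = [b - mean_b for b in map_b]
    numerator = sum(x * y for x, y in zip(dev_a, dev_b))
    denominator = math.sqrt(sum(x * x for x in dev_a) * sum(y * y for y in dev_b))
    if denominator == 0:
        raise ValueError("a constant map has no defined correlation")
    return numerator / denominator
```

A value closer to 1 indicates that the density simulated from a model follows the experimental map more closely, which is the sense in which the open-channel model is the better fit.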

The modelled conformational change of the E. coli channel is sup- 
ported by the fact that the conversion from a closed to an open channel 
involves mostly rigid body movements of the N- and C-terminal halves
of SecY (Supplementary Fig. 19). To open the lateral gate, the N-terminal
half of SecY undergoes a large rotation and tilt, whereas the C-terminal
half moves less in the opposite direction (Fig. 3c; see also Supplementary
Video 1). SecE undergoes a tilting motion to accommodate movements
of SecY, and SecG moves with the N-terminal half of SecY. These
conformational changes would maintain the hydrophobic belt of the
SecY complex within the lipid environment. In addition to rigid body
movements, there are changes in the 5/6 loop that connects the two halves
of SecY to accommodate the large opening motion. There are also movements
in TM8 and the lower part of TM7. One particularly large change
occurs in the upper part of TM8 (helix 8b), which is displaced towards
the membrane surface (Fig. 3d). The 6/7 loop and TM9, as well as preceding
loop residues, including a conserved arginine (Arg 357), do not
move appreciably (Fig. 3d), consistent with their role in tethering the
channel to the ribosome. The plug domain moves only a small distance,
probably because it is restrained by the disulphide bridge to the signal
sequence. However, the plug does not have to move much to allow
translocation. When viewed from the cytoplasmic side, these conformational
changes open a pore adjacent to the lateral gate (Fig. 3e; see
also Supplementary Video 2). Overall, the changes are more pronounced
than seen previously.

Figure 3 | Structure of the active SecY channel. a, Structure of the E. coli
RNC-SecY channel complex, with large and small ribosomal subunits in blue
and gold, respectively, the SecY complex in red, and ribosomal protein S1 in
tan. b, Front (left) and side (right) views of the channel fit into the segmented
density map (grey). The nascent chain was omitted for clarity. The N-terminal
half of SecY is in light blue, the C-terminal half in red, SecE in dark blue and
SecG in brown. c, Comparison of front views of the closed (left) and open
(right) E. coli SecY channels, with the approximate position of the membrane
indicated by solid horizontal lines. The N-terminal half of SecY is in light blue,
the C-terminal half in red, SecE in dark blue, SecG in brown, and the plug in
yellow. Some movements during channel opening are indicated, such as the
rotation and tilting of the N-terminal half of SecY, the tilting of SecE, and the
movement of helix 8b. Labels for helices 2b and 7 are placed at the same position
in the closed and open channel. Pore residues forming the constriction in the
closed channel are indicated with grey balls and sticks. d, Connections of the
ribosome with the 8/9 loop of SecY and the cytoplasmic helix of SecE in the
closed and open channels (top and bottom panels, respectively). Note the large
movement of helix 8b towards the membrane. e, As in c, but viewed from the top.

Density for the nascent chain was seen inside the ribosomal tunnel,
on the cytoplasmic surface of the SecY complex, inside the channel,
and on its periplasmic side (Fig. 4a and Supplementary Fig. 20). On the
basis of biochemical data, an approximate model for the nascent chain
in the RNC-channel complex was built into the density. The last ~40 
amino acids are located inside the ribosome, as cysteines introduced 
into this segment are inaccessible to a bulky modification reagent. In 
addition, cysteines at positions 19-34 are most favoured to form a disul- 
phide bridge with a cysteine in the plug. Finally, the position of the end 
of the signal sequence in our structure is constrained by the disulphide 
crosslink between position 19 of the nascent chain and position 68 of 
the plug. 

The resulting model shows that the hydrophobic core of the signal 
sequence forms a helix in the lateral gate (residues 1-15) (Fig. 4b-d and 
Supplementary Fig. 21), consistent with crosslinking data obtained with 
the yeast Sec61 complex. The signal sequence helix is contacted by
TM2b, helix 8b and TM7 of SecY (Fig. 4b). In a lipid bilayer, much of
the signal sequence, including parts that follow the hydrophobic region,
would be exposed to the hydrocarbon chains of phospholipids, again in
agreement with crosslinking experiments. Additional density below
and adjacent to the signal sequence helix can account for the other side 
of the nascent chain loop. The pore through which the mature region of 
the nascent chain would move into the extracellular funnel is not exactly 
in the centre of the channel, but the translocating polypeptide may still 
be surrounded by pore ring residues that form a constriction in the closed 
channel (Supplementary Video 2). Crosslinking to the nascent chain 
may restrain the plug, keeping it in the centre of the channel. However, 
there is still room for the nascent chain to form a loop in the pore. 

We modelled density on the cytoplasmic surface of the channel as a 
loop that extends parallel to the surface and towards the back of the 
channel (residues ~45-63) (Fig. 4a, e). This part of the nascent chain 
lies in a V-shaped groove, which is framed by the base of the 6/7 loop 
and TM10 of SecY (Supplementary Fig. 22 and Supplementary Video 
3). However, the nascent chain may adopt an alternative orientation 
with a loop that extends above the lateral gate (marked with an asterisk 
in Fig. 4a, d). The nascent chain may also slide up and down the axis of 
the channel to some extent, as there is density on the periplasmic side 
that is not fully accounted for in our model. 

In summary, our structures show that ribosome binding alone does 
not induce major changes in the SecY channel, although it may cause 
transient opening. Rather, stable opening of the channel requires
loop insertion of the nascent chain. As predicted, the hydrophobic
part of the signal sequence forms a helix that occupies the open lateral 
gate. The signal sequence would thus become part of the channel wall, 
thereby increasing the size of the pore through which the polypeptide 
moves across the membrane. At later stages of translocation, the signal 
sequence is cleaved from the nascent chain and released from the lateral 
gate, which may result in a narrower pore. It is also possible that the 
signal sequence leaves the lateral gate before cleavage. This hypothesis
would be consistent with a two-dimensional crystal structure of the SecY
complex that showed a synthetic signal peptide bound to the outside of
an essentially closed channel.

Figure 4 | Path of the nascent chain. a, Density
(in light gold) and model (green line) for the
nascent chain in the RNC-channel complex. The
P-site tRNA is in brown, the ribosome in grey, and
the channel in blue. The top right panel shows the
entire RNC-channel complex from the same
viewing angle. The bottom right panel shows the
density and model for the nascent chain, with
ribosome and channel omitted. The asterisk
indicates density for an alternative orientation of
the nascent chain loop on the cytoplasmic side
of the channel (see also d). b, Side view of the signal
sequence (ss) helix in the lateral gate. Density
for the nascent chain on the cytoplasmic surface
was removed for clarity. c, As in b, but viewed from
the top along the axis of the signal sequence helix.
d, As in c, but from a slightly different angle of
view with nascent chain density on the cytoplasmic
surface included. e, As in d, but without the
density map.

Our results also indicate that most nascent chains form a loop on the 
cytoplasmic surface of SecY, rather than adopting a fully extended con- 
formation between the ribosome and channel. Although the observed 
looping of the nascent chain at the cytoplasmic surface of the channel 
needs to be confirmed with other substrates, it seems possible that a 
pulling force or ratcheting mechanism may be required to achieve
efficient translocation. SecDF could use a proton gradient across the
membrane together with movements of a periplasmic domain to pull
on the nascent chain. In addition, polypeptide chain folding or the
binding of periplasmic chaperones may help to move the polypeptide 
chain across the membrane. 


METHODS SUMMARY 


The purification of E. coli 70S ribosomes and of M. jannaschii and E. coli SecY
complexes was described previously. M. jannaschii 70S ribosomes were purified
by sucrose gradient centrifugation, dissociated into 50S and 30S subunits, and
re-associated. Non-translocating ribosome-SecY complexes were reconstituted by
mixing ribosomes with a five- to eightfold molar excess of the SecY channels in
n-dodecyl-β-D-maltoside (DDM). An RNC-SecY complex was generated in E. coli
cells by expressing a SecM-stalled nascent chain under the arabinose promoter.
In addition, MazF endoribonuclease was expressed from a Tet promoter. After
forming a disulphide bridge between the nascent chain and SecY by addition of
5,5′-dithiobis-(2-nitrobenzoic acid), RNC-SecY complexes were solubilized in DDM
and purified by tandem affinity chromatography using a Strep-tag on the
ribosomal protein L12 and a His-tag on a fusion of SecE and SecG. Complexes were
further purified by size-exclusion chromatography on Superose 6. Samples for
cryo-electron microscopy were applied to holey grids or to grids with a continuous
carbon film and vitrified. Images were collected at 160 and 200 kV on a Tecnai
FEG 20 (FEI) with nominal magnifications of ×40,000 and ×52,000, using a
TVIPS 4096 × 4096 charge-coupled device or film. Image processing and
single-particle analysis were done with EMAN software. Molecular docking was
carried out with Chimera and MDFF.


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
METHODS


Construction of plasmids and E. coli strains. Plasmids used in this study are 
listed and described in Supplementary Table 3. PCR reactions were performed with 
Phusion polymerase (New England Biolabs) or KOD polymerase (Novagen). The 
E. coli DH5α strain was used for all cloning procedures.

pBAD(MazF)-NC100, a plasmid expressing a SecM-stalled nascent chain under
an arabinose-inducible (ara) promoter, and the MazF endoribonuclease under a
tetracycline-inducible (tet) promoter, has been described. In brief, a DNA sequence
coding for a 100-amino-acid nascent chain was placed after the ara promoter of
pBAD His/C (Invitrogen). The nascent chain contains an N-terminal signal sequence
derived from E. coli DsbA, a Myc-tag and a C-terminal translational arrest sequence
from E. coli SecM. The SecM nucleotide sequence contains three 'ACA' sites
(within the sequence
5′-TTCAGCACACCCGTCTGGATATCACAAGCACAAGGCATCCGTGCTGGCCCT-3′); MazF will cleave
the mRNA at these positions and convert polysomes into monosomes. To keep MazF
uninduced, a TetR repressor was expressed. The tetR gene from Tn10 was cloned and
inserted immediately downstream of the β-lactamase gene of the plasmid (for
bicistronic expression). A DNA sequence for a tet promoter followed by E. coli MazF
was cloned and placed between tetR and the replication origin of the plasmid. pACYC
EhG/Y(68C), expressing a SecE-SecG fusion protein and SecY(68C) from a
constitutive promoter, was constructed as follows. DNA sequences coding for E. coli
SecE (residues 2-127) and SecG (residues 2-110) were fused with a sequence coding
for a His-tag linker (GGSDGHHGHHHHGHHGDSGG). The fusion construct
also contains an N-terminal calmodulin-binding peptide (CBP) tag
(MGSRWKKNFIAVSAANREFKKISGGG). The resulting (CBP-tag)-SecE-(His-tag)-SecG fusion
construct was ligated into pACYC-SecYEG, replacing the original SecE segment.
Subsequently, the original SecG coding sequence from pACYC-SecYEG was
removed by restriction enzyme digestion and re-ligation. For information on other
plasmids, see Supplementary Table 3.
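
The three ACA sites in the SecM sequence given above can be located with a short script. The sketch below is illustrative only; the function name and the 0-based indexing convention are choices made here, not part of the original protocol.

```python
# SecM arrest-coding DNA sequence as printed in the text (5' to 3').
SECM_ARREST_DNA = "TTCAGCACACCCGTCTGGATATCACAAGCACAAGGCATCCGTGCTGGCCCT"


def find_sites(sequence, motif="ACA"):
    """Return 0-based start positions of every motif occurrence, overlaps included."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


if __name__ == "__main__":
    print(find_sites(SECM_ARREST_DNA))
```

The search finds exactly three occurrences in the 51-nucleotide sequence, consistent with the three cleavage sites stated in the text.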

E. coli strains containing chromosome modifications were generated using standard
λRed recombination techniques. To construct an E. coli strain (EP71; BW25113
Δrmf ΔompT rplL-strep::aadA(StrR)) in which ribosomal protein L12 (rplL) is
C-terminally tagged with a Strep-tag (WSHPQFEK), we first synthesized a
'rplL-strep-RBS-aadA' DNA cassette, containing the C-terminal part of the rplL gene
followed by the Strep-tag, a stop codon, a ribosome binding site (RBS), the coding
sequence of a streptomycin resistance gene (aadA) and a short sequence downstream
of the rplL gene. This cassette was amplified by PCR and electroporated into
Δrmf ΔompT cells (EP51) expressing λRed recombinase from the pKD46 plasmid.
The resulting cells were selected on agar medium containing 25 µg ml⁻¹
streptomycin. Incorporation of the cassette into the chromosome was verified by PCR
and immunoblotting using Strep-tag antibodies (Novagen). To delete the
chromosomal secY gene (strain EP72), EP71 cells were first transformed with pKD46
and pACYC EhG/Y(68C). After induction of λRed recombinase, the cells were
electroporated with a PCR product containing a hygromycin resistance gene (hph),
flanked by short sequences homologous to the chromosomal secY locus (so that the
secY coding sequence is replaced by the hph coding sequence). Deletion of
chromosomal secY was verified by PCR.
Preparation of SecY complex and ribosomes. All protein purification procedures
were performed at 4 °C unless otherwise indicated. M. jannaschii and E. coli
SecY complexes and E. coli 70S ribosomes were purified as described previously.
M. jannaschii cells were obtained from the University of Georgia Bioexpression
and Fermentation Facility. M. jannaschii 70S ribosomes were purified by multiple
ultracentrifugation steps as follows. Cells were homogenized in buffer containing
50 mM HEPES-NaOH, pH 7.5, 100 mM KCl, 10 mM MgCl2 and 1 mM dithiothreitol
(DTT), using a French press. After removing cell debris by centrifugation
(SS34 rotor; 1 h at 16,000 r.p.m.), the cell homogenate was loaded onto a sucrose
cushion (containing 50 mM HEPES, pH 7.5, 1 M NH4Cl, 10 mM MgCl2, 1 mM
DTT and 30% (w/v) sucrose), and ribosomes were pelleted by ultracentrifugation
at 45,000 r.p.m. for 5 h (Beckman Ti50.2 rotor). The pelleted ribosomes were
re-suspended in buffer containing 50 mM HEPES, pH 7.5, 1 M NH4Cl, 5 mM MgCl2 and
1 mM DTT, and then sedimented by ultracentrifugation (SW-28 rotor, 24,000 r.p.m.,
12 h) through a linear sucrose gradient (10-40% (w/v) sucrose in the re-suspension
buffer). Fractions containing the 30S and 50S ribosomal subunits were collected
separately and concentrated. The buffer was exchanged to 50 mM HEPES, pH 7.5,
100 mM NH4Cl, 50 mM MgCl2 and 1 mM DTT using a 100-kDa cut-off Amicon Ultra
(GE Healthcare) device. 30S and 50S subunits were mixed at a molar ratio of 2:1. To
purify 70S ribosomes from excess 30S subunits, the complexes were subjected to
centrifugation (SW-28 rotor, 24,000 r.p.m., 12 h) through a 10-40% sucrose
gradient in 50 mM HEPES, pH 7.5, 100 mM NH4Cl, 50 mM MgCl2, 1 mM DTT.
Fractions containing the 70S ribosomes were pooled, concentrated, and dialysed
against buffer containing 50 mM HEPES, pH 7.5, 100 mM NH4Cl, 10 mM MgCl2
and 1 mM DTT. It should be noted that the resulting specimen contained an E-site
tRNA at high occupancy.
Purification of disulphide-crosslinked E. coli RNC-SecY complexes. EP72
(Δrmf ΔompT rplL-strep::aadA ΔsecY::hph pACYC-EhG/Y(68C)) cells harbouring
pBAD(MazF)-NC100 were grown to logarithmic phase in a medium containing
5 g l⁻¹ tryptone, 2.5 g l⁻¹ yeast extract, 10 g l⁻¹ casamino acids and 5 g l⁻¹ NaCl.
The expression of the nascent chain was induced by addition of 0.06% arabinose
for 2 h at 37 °C, followed by E. coli MazF induction with 100 ng ml⁻¹
anhydrotetracycline for 30 min at 30 °C. Disulphide crosslinking between NC100(19C)
and SecY(68C) was then induced by addition of 1 mM 5,5′-dithiobis-(2-nitrobenzoic
acid) (DTNB) to the culture medium for 20 min. DTNB facilitates disulphide-bond
formation between SecY and the nascent chain as efficiently as Cu-phenanthroline
(CuPh3). The cells were pelleted, washed once with buffer containing 50 mM
Tris-HCl, pH 7.2, 5 mM Mg(OAc)2, 150 mM KCl, and frozen. RNC-SecY complexes
were purified as follows. The cells were re-suspended in buffer containing
50 mM Tris-acetate, pH 7.2, 25 mM Mg(OAc)2, 0.3 M NH4Cl and homogenized
with a French press. One per cent n-dodecyl-β-D-maltoside (DDM) was added to
the cell lysate for 1 h to solubilize membranes. After centrifugation (SS-34 rotor,
13,000 r.p.m., 30 min), ribosomes containing Strep-tagged L12 were purified by
applying the lysate to a Strep-Tactin Sepharose column (IBA). The column was
washed with 8 column volumes (CV) of buffer containing 50 mM Tris-acetate,
pH 7.2, 25 mM Mg(OAc)2, 0.4 M NH4Cl, 0.03% DDM, and then with 2 CV of
buffer (TMP200) containing 50 mM Tris-acetate, pH 7.2, 25 mM Mg(OAc)2, 0.2 M
KOAc, 0.03% DDM. Ribosomes were eluted from the column with 4 CV of the
TMP200 buffer containing 4 mM desthiobiotin. To enrich for channel-bound
RNCs containing His-tagged SecE-SecG fusion protein, the eluate was incubated
with Dynal-Talon beads (Invitrogen) for 30 min. The beads were washed three
times with TMP200 buffer, and bound complexes were eluted with TMP200 buffer
containing 120 mM imidazole. The complexes were further purified by gel
filtration on a Superose 6 column (GE Healthcare) equilibrated with buffer containing
50 mM Tris-acetate, pH 7.2, 10 mM Mg(OAc)2, 80 mM KOAc, 0.03% DDM.
Monomeric ribosome fractions were collected and concentrated to 8–9 mg ml⁻¹.
Test for in vitro reconstitution of the RNC-SecY complex. For the experiments 
shown in Supplementary Fig. 12, RNCs containing the DsbA 108}, or NC100 nascent 
chain were isolated as follows. pBAD-DsbA108,4;,(19C) or pBAD-NC100(19C) 
were transformed into Δrmf ΔompT cells (EP51) harbouring the pRARE2 plasmid.
Cells were grown to log phase in 2× YT medium (16 g l⁻¹ tryptone, 10 g l⁻¹ yeast
extract and 5 g l⁻¹ NaCl) supplemented with 100 µg ml⁻¹ ampicillin and 40 µg ml⁻¹
chloramphenicol. Nascent chain expression was induced by addition of 0.4% arabi- 
nose for 3 h. The cells were re-suspended in buffer (TMA750) containing 50 mM 
Tris-acetate, pH 7.2, 25 mM Mg(OAc)2, 0.75 M NH4Cl and 1.5 mM DTT and
homogenized in a French press. To solubilize the membranes, 1% DDM was added 
to the cell extract. The extract was cleared by centrifugation at 13,000 r.p.m. for 1 h. 
The ribosomes were sedimented through a sucrose cushion (TMA750, 30% suc- 
rose, 0.03% DDM) and re-suspended in TMA750. The buffer was exchanged on a 
PD-10 desalting column (GE Healthcare) to buffer TMP100 (50 mM Tris-acetate, 
pH 7.2, 25 mM Mg(OAc)2, 0.1 M KOAc). To purify RNCs containing monosomes,
the ribosomes (OD260nm = 500-1,000) in TMP750 were briefly incubated with
20 µg ml⁻¹ RNase A at room temperature (23 °C) and immediately injected into
a Superose 6 gel-filtration column (GE Healthcare) equilibrated with TMP100
containing 50 mM Tris-acetate, pH 7.2, 25 mM Mg(OAc)2 and 100 mM KOAc.
Fractions containing monomeric ribosomes were collected. 

DsbA108}4;.- or NC100-containing RNCs (0.27 µM total ribosomes) were mixed
with a 15-fold excess (4.1 µM) of the SecY(68C) complex in TMP100 containing
0.03% DDM. When SecY-nanodiscs were used instead of SecY-detergent com-
plexes, 0.138 µM of RNCs were mixed with a fivefold excess (0.7 µM) of SecY-
nanodiscs in the same buffer lacking detergent. After incubating solutions at 4 °C
for 1 h or at 30 °C for 30 min, disulphide bridge formation was induced by addition
of 0.1 mM CuPh3 for 20 min at room temperature. The reaction was stopped by
addition of 20 mM N-ethyl maleimide for 30 min at 4 °C. The samples were sub- 
jected to non-reducing SDS-PAGE and analysed by immunoblotting with Myc 
and SecY antibodies. 
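
The molar excesses quoted above follow from simple multiplication. The sketch below recomputes them as a sanity check; the function name `partner_concentration_um` is illustrative and not part of the original protocol.

```python
def partner_concentration_um(rnc_um: float, fold_excess: float) -> float:
    """Concentration (µM) of SecY needed for a given molar excess over RNCs."""
    return rnc_um * fold_excess


# Values taken from the text: RNC concentration and stated fold excess.
detergent = partner_concentration_um(0.27, 15)   # SecY(68C) in detergent
nanodisc = partner_concentration_um(0.138, 5)    # SecY-nanodiscs

print(f"{detergent:.2f} µM, {nanodisc:.2f} µM")
```

The computed values (about 4.05 µM and 0.69 µM) are consistent with the rounded 4.1 µM and 0.7 µM given in the text.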

Nanodiscs containing SecY(68C) complex were generated as previously described
using the scaffold protein MSP1D1 (ref. 33). In brief, SecY(68C) complexes, MSP1D1 
and deoxyBigChap-solubilized E. coli polar lipid (Avanti Polar Lipids) were mixed 
in a molar ratio of 1:4:100 in 50 mM Tris-acetate, pH 7.2, 150 mM KOAc. After 
removal of the detergent with Biobeads (Bio-Rad), the sample was injected into a 
Superdex 200 column equilibrated with buffer TMP100. Fractions containing the 
SecY-nanodiscs complex were pooled and concentrated with an Amicon Ultra 
device (100-kDa cut-off). 
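
As a worked illustration of the 1:4:100 molar ratio, the following sketch scales the three components to a chosen amount of SecY complex. The 2 nmol input is an arbitrary example, and `component_amounts` is a hypothetical helper, not something described in the original methods.

```python
# Molar ratio SecY(68C) complex : MSP1D1 : lipid, as stated in the text.
RATIO = {"SecY(68C)": 1, "MSP1D1": 4, "lipid": 100}


def component_amounts(secy_nmol: float) -> dict:
    """Scale the 1:4:100 ratio to a chosen amount of SecY complex (nmol)."""
    return {name: secy_nmol * parts for name, parts in RATIO.items()}


amounts = component_amounts(2.0)  # 2 nmol SecY is an arbitrary example amount
for name, nmol in amounts.items():
    print(f"{name}: {nmol:g} nmol")
```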

SDS-PAGE and immunoblotting. SDS-PAGE was performed using 4-12% Bis- 
Tris gels (Bio-Rad) with either MES-SDS or MOPS-SDS running buffer (Invitrogen). 
Immunoblots were developed with a standard ECL reagent and imaged with a charge-coupled
device (CCD)-based imager (Fujifilm LAS-3000). Antibodies against the
C terminus of SecY were described previously**. Anti-Myc and anti-CBP antibod-
ies were obtained from Sigma and Genscript, respectively. 

Cryo-electron microscopy and three-dimensional image processing. M. jannaschii 
ribosomes were mixed with a fivefold excess of M. jannaschii SecYEB in 100 mM 
NH4Cl, 30 mM MgCl2, 20 mM HEPES-KOH, pH 7.5, 6 mM β-mercaptoethanol
and ~0.1% DDM. Samples were added to 400 mesh Cu grids with a holey carbon
film (Quantafoil 2/1); ~2 µl per grid at an OD260 of 60-120 or diluted and added to
400 mesh grids with a thin continuous carbon film. After blotting, samples were 
plunge frozen into liquid ethane with a Vitrobot Mark 3 (FEI). Grids were mounted 
on an Oxford cold holder and imaged at 200 kV on a Tecnai F20. Data were collected
manually on Kodak SO163 film at ×50,000 with a defocus range of −1.0 to
−2.5 µm. Micrographs were scanned on Zeiss SCAI and Creoscitex EVERSMART
scanners and particles selected with EMAN boxer*’ were binned and scaled to
2.73 Å per pixel. In total, ~59,000 particles were corrected for the contrast transfer
function (CTF) with EMAN2 and classified with a supervised multi-reference refine- 
ment into groups, with and without channel, to give a data set with ~37,000 particles 
that contained the channel. Three-dimensional reconstructions from six EMAN2 
refinements carried out with different parameters and estimated resolutions of
9.2-9.5 Å (based on half-data set comparisons) were aligned in Chimera and averaged
to obtain a final three-dimensional density map. 

Non-programmed E. coli ribosome-channel complexes were prepared for cryo- 
electron microscopy and imaged at ×50,000 with a Gatan (626-DH) cold holder
at 200 kV, as described previously*. After identifying and removing complexes
without channels, ~39,000 particles were processed with EMAN1 (ref. 35) at a
pixel size of 2.73 Å (for details see ref. 8). Aliquots of E. coli RNCs with SecYEG
(OD260 = 120-160 in ~0.06-0.1% DDM) were thawed and kept on ice. Samples
were applied to 300 mesh Cu grids with a holey support film (Quantafoil 2/1 for
imaging at ×42,000) and 400 mesh grids (Quantafoil 1.2/1.3 for imaging at ×50,000).
The holey grids had a very thin layer of carbon freshly applied by evaporation and
were glow-discharged in air before use. A Vitrobot or a manual plunger was used to
plunge-freeze grids into liquid ethane after blotting, with the chamber at room
temperature and a relative humidity of ~95-100%. Samples were loaded onto an
Oxford cold holder and images obtained at 160 kV on a 4,096 × 4,096 CCD (TVIPS)
with a semi-automated, single-particle collection program in EMtools (TVIPS) on 
a TF-20. Particle images were selected using e2boxer and further processed with 
EMAN2 (ref. 29). 

The CTF correction was based on all particles from each CCD frame (~450,000
from ~3,500 frames), including RNC-channel complexes that formed aggregates,
after scaling data collected at ×50,000 to 2.12 Å per pixel. Subsequently, multiple
cycles of reference-free classification in EMAN2 were used to extract ~167,000
single particles without close nearest neighbours for final processing. A ribosome
at 25 Å resolution, with and without the channel, was used as a starting model. The
program e2refinemulti.py was used to separate the data set into two groups, which
were refined separately to a resolution of ~11-12 Å. A final supervised classifica-
tion with e2refinemulti.py at an angular step size appropriate for 14 Å resolution
was then carried out with the full data set, using three-dimensional references with
and without the channel filtered to 14 Å. This step used the Fourier ring correlation
comparator and provided an improved separation of the data set. At this stage
~83,000 particles with channels from the supervised classification were sorted
further with e2ligandclassify.py, on the basis of their signal-to-noise ratio, to give a
final data set of ~53,000 particles. Two separate structure refinements were then
done, starting with either the best three-dimensional reference from the original
low-resolution ribosome model or using a 6.8 Å resolution E. coli ribosome map
(EMDB code, 5036) scaled to 2.12 Å per pixel. After convergence, the four best maps
(two from each structure path calculated with different refinement parameters)
were aligned in Chimera and averaged to give the final three-dimensional map.
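
The successive selection steps described above can be summarized as a simple funnel. The sketch below tabulates the approximate particle counts quoted in the text and the fraction of the initial set retained at each stage; the stage labels are paraphrases, and the counts are the rounded values from the text rather than exact figures.

```python
# Approximate particle counts quoted in the text for the active E. coli data set.
STAGES = [
    ("all boxed particles", 450_000),
    ("isolated single particles", 167_000),
    ("with channel (supervised classification)", 83_000),
    ("final data set (signal-to-noise sorting)", 53_000),
]


def retention(stages):
    """Return (label, count, fraction of the initial count) for each stage."""
    first = stages[0][1]
    return [(label, n, n / first) for label, n in stages]


for label, n, frac in retention(STAGES):
    print(f"{label:45s} {n:>8,d} {frac:6.1%}")
```

On these rounded numbers, roughly 12% of the initially boxed particles contribute to the final map.
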
Molecular modelling and docking. Maps from M. jannaschii and active E. coli 
ribosome-channel complexes were subjected to a local normalization in EMAN2 
to allow densities for ribosomal proteins, RNA, channel and micelle to be displayed 
and analysed using a single density cut-off. Maps were segmented with Chimera 
using Zone and difference map options (vop subtract)”. Small and large ribosomal 
subunit models were fit into the ribosome-channel density maps using the Chimera fit
in map option” and MDFF’ with runs of 500,000 steps (0.5 ns). Because no model
was available for the M. jannaschii ribosome, we used a model of the related
complex from P. furiosus (ref. 6, PDB ID: 3J20, 3J21 and 3J2L). Extra copies of
ribosomal proteins and ribosomal RNA loops from the P. furiosus model that are
absent in M. jannaschii were omitted. For E. coli ribosome-channel complexes, a
nearly complete model of the large ribosomal subunit based on electron micro-
scopy modelling and a crystal structure (ref. 11, PDB ID: 3J01; ref. 12, PDB ID:
2I2T) were used, along with a crystal structure of the small subunit (ref. 12, PDB
ID: 2I2P). Models for tRNAs and mRNA were obtained from a crystal structure of
a programmed T. thermophilus ribosome (ref. 36, PDB ID: 3I8G).
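
The stated MDFF run length (500,000 steps, 0.5 ns) implies a particular integration timestep. A minimal check, assuming only the two numbers given in the text (the helper name is illustrative):

```python
def implied_timestep_fs(total_ns: float, n_steps: int) -> float:
    """Integration timestep (fs) implied by a run length and step count."""
    total_fs = total_ns * 1_000_000  # 1 ns = 1,000,000 fs
    return total_fs / n_steps


timestep = implied_timestep_fs(0.5, 500_000)
print(f"{timestep:g} fs per step")
```

This gives 1 fs per step.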

The global resolution in experimental density maps was determined separately 
for the ribosome and channel in each structure using Fourier shell correlation 
(FSC) in EMAN2, with reference maps calculated from Protein Data Bank files of 
docked models. Reference maps were calculated with pdb2mrc in EMAN at 7 Å
resolution and aligned in Chimera to the appropriate experimental map, then 
saved with vop resample onGrid. Experimental maps of ribosomes, as part of their 
cognate ribosome-channel complex, had a soft mask applied after calculation in 
EMAN2. Density maps for channels were created by segmentation in Chimera 
which also effectively created a mask. However, no masks were created for refe- 
rence maps to prevent spurious correlations between similar masks in the FSC 
calculations between the two volumes being compared. The 0.5 criterion was used 
in all cases to identify the resolution. 
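
To make the 0.5 criterion concrete, the sketch below reads a resolution off an FSC curve by locating the first shell at which the correlation drops below 0.5 and interpolating linearly. This is a generic illustration, not the EMAN2 implementation; the curve values are made up for the example and are not data from the paper.

```python
def resolution_at_threshold(freqs, fsc, threshold=0.5):
    """Return the resolution (Å) where the FSC curve first drops below threshold.

    freqs: spatial frequencies in 1/Å, ascending.
    fsc:   correlation value for each shell.
    Linear interpolation is used between the two shells that bracket the crossing.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Fraction of the way from shell i-1 to shell i at which FSC == threshold.
            t = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            crossing = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / crossing
    return None  # curve never crosses the threshold


# Made-up curve for illustration only; not data from the paper.
freqs = [0.02, 0.04, 0.06, 0.08, 0.10, 0.12]
fsc = [0.99, 0.95, 0.85, 0.70, 0.40, 0.15]
res = resolution_at_threshold(freqs, fsc)
print(f"Resolution at FSC = 0.5: {res:.1f} Å")
```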

Models for closed and open E. coli SecYEG channels were constructed as follows. 
SecY in the closed channel was based on individual structural elements (helices and 
turns) from the crystal structure of T. thermophilus SecY*. These segments were 
docked onto the closed crystal structure of SecY from M. jannaschii’ in Chimera, 
on the basis of sequence alignments between the three organisms. Loops were then 
regularized and additional residues added as needed in Coot”’. SecE and SecG sub- 
units were taken from the crystal structure of T. maritima SecYEG”’. The structural 
model was then mutated to E. coli sequences, energy minimized with NAMD* and 
fit into the map with Chimera®® and MDFF’. A model for the open E. coli channel 
was constructed in a similar way, on the basis of a crystal structure of a partially 
open SecYE channel from P. furiosus°. SecY models were positioned initially in the 
maps by docking the 6/7 and 8/9 loops into their density with Rosetta’’. All MDFF 
runs with these components were done with segmented maps that contained the 
large ribosomal subunit and complete density for the channel and micelle. Models 
for the large subunit and SecY channel were minimized together. Importantly, the 
model of a partially open channel moved into correct density, to reveal the signal 
sequence helix and associated density for the nascent chain. Finally, no density was 
observed for the first two TMs of E. coli SecE, which are connected by an extended 
linker to the surface helix of this subunit, and thus may be flexible. 

Structures of the Sec61 complex engaged in nascent
peptide translocation or membrane insertion

The biogenesis of secretory as well as transmembrane proteins
requires the activity of the universally conserved protein-conducting 
channel (PCC), the Sec61 complex (SecY complex in bacteria)’. In 
eukaryotic cells the PCC is located in the membrane of the endoplas- 
mic reticulum where it can bind to translating ribosomes for co- 
translational protein transport. The Sec complex consists of three 
subunits (Sec61α, β and γ) and provides an aqueous environment for
the translocation of hydrophilic peptides as well as a lateral opening
in the Sec61α subunit that has been proposed to act as a gate for
the membrane partitioning of hydrophobic domains’. A plug helix 
and a so-called pore ring are believed to seal the PCC against ion flow 
and are proposed to rearrange for accommodation of translocating 
peptides”*. Several crystal and cryo-electron microscopy structures 
revealed different conformations of closed and partially open Sec61 
and SecY complexes***. However, in none of these samples has 
the translocation state been unambiguously defined biochemically. 
Here we present cryo-electron microscopy structures of ribosome- 
bound Sec61 complexes engaged in translocation or membrane 
insertion of nascent peptides. Our data show that a hydrophilic 
peptide can translocate through the Sec complex with an essentially 
closed lateral gate and an only slightly rearranged central channel. 
Membrane insertion of a hydrophobic domain seems to occur with 
the Sec complex opening the proposed lateral gate while rearranging 
the plug to maintain an ion permeability barrier. Taken together, 
we provide a structural model for the basic activities of the Sec61 
complex as a protein-conducting channel. 

Ribosome-protein-conducting channel (PCC) complexes were formed 
using a well-established in vitro protein translocation system: a wheat 
germ translation extract combined with canine pancreatic endoplasmic 
reticulum membranes’. We chose nascent polypeptides derived from 
the leader peptidase (Lep) protein as intermediates. Depending on
the hydrophobicity of a variable region, they show distinct translocation
(LepT) or membrane insertion (LepM) behaviour”? (Fig. 1a). The pep- 
tides are 338 amino acids long, carry two transmembrane helices (TM1, 
TM2) followed by a 149 amino acid long lumenal loop containing a first 
glycosylation (GS1) site as well as a streptavidin and a haemagglutinin 
tag. The loop ends with the variable region that is either hydrophilic 
(LepT) or hydrophobic (LepM), followed by a carboxy-terminal stretch 
containing the second glycosylation site (GS2) and a ribosomal stalling 
sequence (cytomegalovirus (CMV) upstream open reading frame (uORF)
gp48) (Fig. 1a)'’”*. The length of the C-terminal stretch (93 amino 
acids) was chosen such that the variable region can fully engage the 
PCC when translation is stalled. Notably, the translocation state of these 
peptides can be precisely monitored by their glycosylation state: only 
one of two possible sites, that on the lumenal side (GS1) of the endo- 
plasmic reticulum membrane, is glycosylated when the nascent peptide 
is trapped in the ribosome-PCC complex in an intermediate state. 
Hence, this novel approach should provide us with bona fide transloca- 
tion intermediates that are functionally better defined than complexes 
that have been reconstituted in vitro from purified complexes. 


When the messenger RNAs were translated in the absence of mem- 
branes, we observed the expected stalled transfer-RNA-bound peptide 
as well as free peptide owing to ineffective stalling (Fig. 1b, c). In the 
presence of membranes, however, after translocation of the loop region a 
single glycosylation event led to one additional shifted peptidyl-tRNA 
band for both constructs, LepT and LepM. Because of inefficient stalling 
two additional bands indicated dual and single glycosylation events, 
respectively, for the free fully translocated LepT and membrane-inserted 
LepM peptides (Fig. 1b, c). The peptidyl-tRNA bands disappear after 
puromycin treatment as expected (Fig. 1b). After optimizing the trans- 
lation conditions for enrichment of stalled and glycosylated intermedi- 
ates, ribosome- and membrane-bound nascent peptides were isolated by 
membrane pelleting and mild detergent solubilisation followed by affin- 
ity purification via the streptavidin-tag in the nascent peptide. Western 
blot analysis of the purified sample indicated that the final fractions 
consisted of ribosome-nascent chain complexes (RNCs) highly enriched 
for mono-glycosylated tRNA-bound nascent peptide and Sec61 complex 
(Fig. 1b, c and Extended Data Fig. 1), which was confirmed by mass 
spectrometry. The purified complexes therefore represented bona fide 
translocation or membrane insertion intermediates. 
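
The glycosylation readout described above amounts to a small decision table: whether the peptide is still attached to tRNA, and how many of the two sites (GS1, GS2) carry a glycan. The sketch below encodes that table; the function name and the state labels are paraphrases introduced here for illustration, not terminology from the experiments.

```python
def interpret_band(construct: str, trna_bound: bool, glycans: int) -> str:
    """Map a western-blot band to the state described in the text."""
    if construct not in ("LepT", "LepM"):
        raise ValueError("construct must be 'LepT' or 'LepM'")
    if trna_bound:
        if glycans == 0:
            return "stalled, not translocated"
        if glycans == 1:
            # Only the lumenal site GS1 is modified in the trapped intermediate.
            return "stalled translocation/insertion intermediate (GS1 only)"
    else:
        if glycans == 0:
            return "released peptide, not translocated"
        if construct == "LepT" and glycans == 2:
            return "released, fully translocated"
        if construct == "LepM" and glycans == 1:
            return "released, membrane-inserted"
    return "not described in the text"


print(interpret_band("LepT", trna_bound=True, glycans=1))
print(interpret_band("LepM", trna_bound=False, glycans=1))
```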

The purified LepT and LepM complexes were subjected to cryo- 
electron microscopy and single-particle analysis. Applying in silico 
sorting procedures to the data sets resulted in final reconstructions 
of the two programmed RNC-Sec61 complexes as well as an idle 
ribosome-Sec61 complex lacking peptidyl-tRNA. All structures were 
solved at resolutions between 6.9 and 7.8 Å (Fig. 1d-f and Extended
Data Fig. 2). Resolution measurements were further validated by mea-
suring cross-resolution between channel densities and the obtained
molecular models (Extended Data Fig. 3), resulting in a local resolution
of approximately 7.5 Å for all Sec61 densities, which allowed unam-
biguous resolution of α-helical secondary structure in the channels.

The presence of peptidyl-tRNA indicated a high degree of program- 
ming for the LepT and LepM intermediates. The densities correspond- 
ing to the PCC showed the central Sec61 protein surrounded by a 
mixed detergent lipid micelle essentially as observed before’. The trans- 
membrane segments, the proposed lateral gate and the plug helix of 
Sec61a were well resolved and, thus, allowed for unambiguous position- 
ing of homology models in all cases (Extended Data Fig. 4). Notably, an 
extra density belonging to an inserting transmembrane helix of LepM 
was observed in the lateral gate of the LepM-engaged PCC (Extended 
Data Fig. 4c). 

Overall, the mode of ribosome binding seemed to be very similar in 
all three structures (Extended Data Fig. 4). Cytoplasmic loops L6/L7 
and L8/L9 of Sec61α contact the universal ribosomal adaptor site,
consistent with previous cryo-electron microscopy studies of ribo- 
some-bound Sec61 and SecY complexes”'*"*. This indicates that, in 
eukaryotes, the overall mode of ribosome binding is not directly depen- 
dent on the different modes of PCC activity. 

The conformation of the ribosome-bound idle Sec61 complex bears
a strong resemblance to the conformation observed in the archaeal
Methanococcus jannaschii SecYEB crystal structure’, in which the
lateral gate is closed and the plug obstructs the central constriction of
the channel. In the idle Sec61 complex the lateral gate is also closed
(Fig. 2a); however, the lumenal part of TM7 is shifted slightly
towards the amino-terminal half of Sec61 and the plug is also shifted
by approximately 3.5 Å towards the lumenal side when compared to the
crystal structure. This movement is accompanied by a small rotation of
TM10 of Sec61α (Extended Data Fig. 5a, b) and may explain changes in
ion conductivity observed upon ribosome binding to Sec61”°.

In the engaged Sec61 complex containing the hydrophilic LepT inter-
mediate, the conformation of Sec61 is slightly more open compared to
the idle state, and is very similar to a previous cryo-electron microscopy
structure of an RNC-bound mammalian Sec61 complex’ (Extended Data
Fig. 5f). The lateral gate is partially opened, yet only by a 4 Å lateral shift
of TM7 of Sec61α (Fig. 2c). The observed rather small opening of the two
halves of Sec61α during peptide translocation is in agreement with
chemical crosslink studies'®. Here, TM7 and TM2 of a translocating
SecY can be locked using cross-linkers with very short spacer length
(between 2 and 5 Å), strongly indicating that a rather closed conforma-
tion of the lateral gate still allows for peptide translocation’®.

Figure 1 | Generation and cryo-electron microscopy structures of
translocating and inserting ribosome-Sec61 complexes. a, Line diagram of
LepT and LepM constructs with distinct variable regions. aa, amino acids; HA,
haemagglutinin; Strep, streptavidin. b, c, Analysis and purification of LepT and
LepM intermediates. Western blots probing for haemagglutinin indicate
(glycosylated) peptidyl-tRNA, unglycosylated as well as mono- and
bi-glycosylated free peptides as illustrated in schematic drawings. An EndoH
background band is indicated by an asterisk. d-f, Cryo-electron microscopy
reconstructions of the idle 80S-Sec61 complex (d), the LepT-RNC-Sec61
complex (e) and the LepM-RNC-Sec61 complex (f). Right panels show cut
density to visualize the ribosomal tunnel and peptidyl-tRNA.

Figure 2 | Models for idle and translocating Sec61. a, Model for the idle
Sec61 complex with (left) and without isolated density (right). b, Model
for the translocating LepT-engaged Sec61 complex. NC, nascent chain.
c, d, Comparison between idle and LepT-engaged Sec61 complex. c, Left, view
on the lateral gate; right, lumenal view focusing on the plug; d, left, side view
focusing on TM10 and the plug; right, cytoplasmic view of the LepT-engaged
Sec61 complex. For LepT models the presence of the translocating peptide is
indicated as a green dashed line or a green asterisk. The colour code for TM2,
TM7, TM10 and the plug is given underneath.

In our LepT complex the plug did not show a detectable shift when 
compared to the idle Sec61 complex. However, immobilization of the 
plug has been shown to still allow unrestricted protein translocation in 
bacteria’’. In the LepT-engaged Sec61α the lumenal part of TM10 was
shifted outward, away from the plug (Fig. 2d). This shift of approxi-
mately 6 Å would be sufficient to provide the required opening for the
accommodation of an extended translocating peptide segment between 
the plug and TM10"*. This is in perfect agreement with previous cross- 
link data showing that the translocating peptide is in the immediate 
vicinity of the plug helix, TM10 and TMS of the SecY complex. More- 
over, this shift would also disrupt the aliphatic pore-ring by pulling one 
participating residue from TM10 out of the assembly. At the given 
resolution the extended and most likely flexible translocating peptide 
was not visible. Regardless, the observed conformation allows placing 
an arbitrary model peptide, even with bulky side chains, into the aque- 
ous interior of the PCC that traverses from the cytoplasmic to the 
lumenal side without any clashes (Extended Data Fig. 6). Furthermore 
we believe that subtle changes in the position of helices in the central 
channel (for example, TM10) can dynamically accommodate the geo- 
metry of virtually any translocating peptide. 

Taken together, when engaged in translocation of a hydrophilic 
nascent peptide, the Sec61 complex can adopt a conformation with a 
lateral gate opened only by a few Angstroms and a continuous central 
conduit provided without major displacement of the plug. 

In the Sec61 complex containing the hydrophobic LepM intermedi-
ate the conformation of the complex showed an open lateral gate (Fig. 3,
Extended Data Fig. 4c). TM2 and TM7 of Sec61α moved apart by
approximately 12 Å to create a gap that harbours a rod-like extra density
corresponding to three to four turns of an α-helix. Previous biochemical
data’*”” and cryo-electron microscopy studies of the SecYE complex*”°
(Extended Data Fig. 5g, h) indicated the position of an engaged signal
sequence or a signal anchor sequence in the lateral gate. Furthermore,
it has been shown that canonical transmembrane domains can also
be retained at the PCC by protein-protein interactions until release is
triggered by translation termination or the arrival of another trans-
membrane segment”. It is therefore likely that the observed density
represents the helical LepM transmembrane segment that is still con-
nected by the C-terminal linker to the tRNA. This would support the
hypothesis that nascent transmembrane domains are indeed inserted
into the lipid bilayer through the proposed lateral gate*’®’° and that
these domains adopt a helical conformation when partitioning from the
PCC into the lipid phase***.

Figure 3 | Model for the membrane-inserting Sec61. a, Model for the
inserting LepM-engaged Sec61 complex with isolated density (left);
middle and right, side view and cytoplasmic view of the Sec61
complex. b, c, Comparison between LepM-engaged and idle Sec61
complex (left) and LepT-engaged Sec61 complex (right) focusing on
the lateral gate (b) and focusing on TM10 and the plug (c). The
model for the inserting LepM transmembrane helix (TM) is shown
in green in a. The colour code for

Figure 4 | Conformational transitions of Sec61 during co-translational
protein translocation and membrane insertion. a, In the ribosome-bound 
idle state, the lateral gate of the Sec61 complex is closed and the central 
constriction is closed by TM10 and the plug. b, When engaged with a 
translocating peptide the lumenal part of TM10 moves outward. This creates a 
central opening for the hydrophilic nascent chain (green). Whereas the plug 
and TM7 remain unchanged, TM2 rearranges slightly, resulting in a partial 
opening of the lateral gate. c, Upon encounter of a more hydrophobic peptide 
stretch that is supposed to be inserted into the lipid bilayer as a transmembrane 
domain (stop-transfer sequence), the lateral gate opens up further. It 
accommodates the peptide segment in a helical conformation between TM2 
and TM7 to allow access to the lipid phase. Lateral gate opening and transfer of 
the hydrophobic peptide from the central aqueous channel into the lateral gate 
are accompanied by a concomitant inwards movement of the plug and TM10. 
Thereby, the ion permeability barrier may be maintained. 

In the observed open conformation of the Sec61 complex the plug
has moved towards the central constriction of the channel (Figs 3c, 4). 
This is consistent with fluorescence data indicating that the plug does 
not move into a more hydrophilic environment during transmembrane 
helix insertion”. In its new position the plug closes the void created by 
both the exiting peptide and the separation of the N- and C-terminal 
halves of Sec61α. In addition, an inward movement of TM10 towards
the plug by 3 Å was observed (Fig. 3c). This concomitant movement of
plug and TM10 may contribute to maintain a sealed central channel 
when a hydrophobic stretch arrives at the PCC and leaves the aqueous 
interior to engage the lateral gate (Fig. 4). Notably, the conformation of 
the open Sec61 complex is most similar to that observed in the crystal 
structure of an idle archaeal Sec complex’ (Extended Data Fig. 5i) in 
which the crystal packing led to an opening of the lateral gate. 

Taken together, the specifically glycosylated stalled nascent peptides 
provide a solid biochemical foundation for our structural analysis of 
engaged ribosome-bound protein-conducting channels. Our models 
provide a basic structural framework for the conformational transi-
tions that enable the Sec61 complex to function in peptide translocation as well
as in membrane insertion of nascent polypeptides (Fig. 4).


METHODS SUMMARY 


Bona fide translocating or inserting RNC-Sec61 complexes containing glycosylated 
LepT and LepM nascent peptides were generated using a wheat germ cell-free extract 
supplemented with dog pancreas membranes (puromycin/high-salt-treated rough 
membranes, PKRM) and purified signal recognition particle (SRP). Ribosome- 
containing membranes were pelleted, solubilized with digitonin and RNC-PCC 
complexes were affinity-purified essentially as described before’? using a streptavi- 
din-tag on the nascent peptide (see Methods). For cryo-electron microscopy, LepT- 
RNC-Sec61 and LepM-RNC-Sec61 complexes were vitrified and data were collected
on a Titan Krios electron microscope (FEI). Single-particle analysis, three-
dimensional reconstruction and computational sorting were done using the SPIDER
software package*!. Molecular modelling and visualization were done using the Coot”?
and Chimera*’ software packages.


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 

Aprataxin resolves adenylated RNA-DNA junctions
to maintain genome integrity


Percy Tumbale, Jessica S. Williams, Matthew J. Schellenberg, Thomas A. Kunkel & R. Scott Williams


Faithful maintenance and propagation of eukaryotic genomes is 
ensured by three-step DNA ligation reactions used by ATP-dependent 
DNA ligases'”. Paradoxically, when DNA ligases encounter nicked 
DNA structures with abnormal DNA termini, DNA ligase catalytic 
activity can generate and/or exacerbate DNA damage through abort- 
ive ligation that produces chemically adducted, toxic 5’-adenylated 
(5’-AMP) DNA lesions**. Aprataxin (APTX) reverses DNA adeny- 
lation but the context for deadenylation repair is unclear. Here we 
examine the importance of APTX to RNase-H2-dependent excision 
repair (RER) of a lesion that is very frequently introduced into DNA,
a ribonucleotide. We show that ligases generate adenylated 5’ ends
containing a ribose characteristic of RNase H2 incision. APTX effi-
ciently repairs adenylated RNA-DNA, and acting in an RNA-DNA
damage response (RDDR), promotes cellular survival and prevents 
S-phase checkpoint activation in budding yeast undergoing RER. 
Structure-function studies of human APTX-RNA-DNA-AMP-Zn 
complexes define a mechanism for detecting and reversing adenyla- 
tion at RNA-DNA junctions. This involves A-form RNA binding, 
proper protein folding and conformational changes, all of which are 
affected by heritable APTX mutations in ataxia with oculomotor 
apraxia 1. Together, these results indicate that accumulation of ade- 
nylated RNA-DNA may contribute to neurological disease. 
Previous studies indicate that abortive ligation (Fig. 1a) may occur
during attempts to repair DNA lesions generated by oxidation*” or alky-
lation’®. We explored a much more abundant opportunity for abortive
ligation, that is, during ribonucleotide excision repair (RER)*""’. RER is
initiated when RNase H2 cleaves on the 5’ side of a ribonucleotide found
in a 5'-RNA-DNA-3’ junction (Fig. 1b, referred to hereafter as RNA-
DNA junction). This event is estimated to generate more than 1,000,000 
nicked RNA-DNA junctions per cell cycle in mice’? and more than 
10,000 nicked RNA-DNA junctions per cell cycle in budding yeast". 
Our study was prompted by the fact that ribonucleotides are introduced 
into the nuclear genome at levels that are much greater than all known 
types of DNA damage combined, and evidence that DNA ligation in vitro 
is impaired at incised RNA-DNA junctions’"*. We compared the ability 
of human DNA ligase I to seal a nick containing canonical 3’-OH and 
5'-P termini to a nick containing a 3’-OH and a 5’-P attached to a rG
that mimics a nick generated when RNase H2 initiates RER (Fig. 1b). 
Greater than 95% of the nicked DNA substrate containing the 3’-OH 
and 5’-P termini was ligated within 10 min. In contrast, the presence of 
a single ribonucleotide (rG) on the 5’ side of the nick (5’ RNA substrate, 
Fig. 1b) significantly impaired generation of the 39-nucleotide ligation 
product (<1% ligation at 10 min, Extended Data Fig. 1a). Ligase I 
processing of the 5’ RNA substrate also produced an additional species 
migrating at a size of ~20 nucleotides, that corresponds to a bona fide 
5’-adenylated product (5'-AMPRNA-DNA) (Fig. 1b and Extended Data 
Fig. 1b). The adenylated product comprises greater than 50% of all 
DNA ligase I catalytic events on the 5’ RNA substrate at all time points 
measured (Fig. 1cand Extended Data Fig. 1a). Also, human DNA ligase 
III and bacteriophage T4 DNA ligase, but not Escherichia coli NAD- 
dependent LigA, generated similar amounts of ribonucleotide-triggered 
abortive ligation products (Fig. 1c). Thus, incised RNA-DNA junctions 
are poor substrates for eukaryotic DNA ligase nick-sealing reactions, 
and also trigger abortive ligation at high frequency in vitro. 

Figure 1 | Abortive ligation at RNA-DNA junctions is resolved by APTX. 
a, ATP-dependent DNA ligation: (1), ATP-dependent DNA ligase adenylation; 
(2), AMP is transferred to the DNA 5’ phosphate to form 5’-AMP; 
(3), alignment of a DNA 3’-hydroxyl with 5’-AMP within the ligase active site 
facilitates the nick-sealing reaction. Ligase encounter with distorting termini 
triggers abortive ligation. b, DNA ligation is aborted at RNA-DNA junctions. 
The red ‘R’ indicates the position of the ribonucleotide. c, Quantification of total 
catalytic events producing sealed DNA ends (39-nucleotide product, blue bars) 
or abortive DNA adenylation (red bars) by DNA ligases. Mean ± s.d. (n = 2 
replicates) is displayed for 60-min ligation reactions. d, Human APTX DNA- 
adenylate hydrolysis. Reactions contained 2 nM human APTX and 10 nM of 
the indicated substrate. 
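
The quantification described for Fig. 1c treats each product molecule as one catalytic event, so the abortive share is the adenylated product divided by the sum of adenylated and sealed products. A minimal sketch of that calculation follows; the function name and the band intensities are hypothetical and are not data from this study.

```python
def abortive_fraction(adenylated: float, ligated: float) -> float:
    """Fraction of catalytic events that ended as 5'-adenylated (abortive) product."""
    total = adenylated + ligated
    if adenylated < 0 or ligated < 0 or total <= 0:
        raise ValueError("band intensities must be non-negative with a positive sum")
    return adenylated / total


# Hypothetical gel band intensities (arbitrary units), for illustration only.
example = abortive_fraction(adenylated=60.0, ligated=40.0)
print(f"abortive fraction: {example:.2f}")
```

With these illustrative inputs more than half of the events are abortive, which is the kind of outcome the text reports for DNA ligase I on the 5’ RNA substrate.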

Aprataxin deadenylase (APTX in mammals and Schizosaccharomyces 
pombe, and Hnt3 in Saccharomyces cerevisiae) reverses DNA adenylation. 
Inactivation of APTX in ataxia oculomotor apraxia 1 (AOA1) 
suggests that persistent adenylated DNA strand breaks drive cerebellar 
degeneration in neurological disease. However, the molecular context 
for APTX deadenylation remains uncertain. To examine a potential 
role for APTX during RER, we compared steady-state kinetic parameters 
for deadenylation by human APTX on gel-purified abortive 
ligation substrates arising from metabolism of RNA-DNA junctions 
(5’-AMP^RNA-DNA) to those representative of abortive ligation on DNA 
single-strand breaks created by reactive oxygen species (5’-AMP^DNA) 
(Fig. 1d and Extended Data Fig. 1c). Both substrates were efficiently 
processed with comparable rates (kcat = 0.31 versus 0.37 s^-1) and with 
catalytic efficiencies that are ~30,000-fold higher than those reported on 
nucleotide substrates. A ~6-fold higher kcat/Km for 5’-AMP^RNA-DNA 
versus 5’-AMP^DNA indicates that human APTX displays an in vitro 
preference for the RNA-DNA-derived substrates. 
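
Because catalytic efficiency is kcat/Km, the two quoted kcat values and the ~6-fold efficiency difference together imply a ratio of Km values. The sketch below works through that arithmetic. It assumes that 0.31 s^-1 belongs to the RNA-DNA-derived substrate and 0.37 s^-1 to the DNA substrate (the order in which the substrates are named), which the text does not state explicitly; the function name is illustrative.

```python
def implied_km_ratio(kcat_a: float, kcat_b: float, efficiency_fold: float) -> float:
    """Return Km_b / Km_a, given that kcat_a/Km_a equals `efficiency_fold`
    times kcat_b/Km_b (a rearrangement of the catalytic-efficiency ratio)."""
    if kcat_a <= 0 or kcat_b <= 0 or efficiency_fold <= 0:
        raise ValueError("all inputs must be positive")
    return efficiency_fold * kcat_b / kcat_a


# Values quoted in the text; the substrate assignment is an assumption.
KCAT_RNA_DNA = 0.31  # s^-1, assumed to be the RNA-DNA-derived substrate
KCAT_DNA = 0.37      # s^-1, assumed to be the DNA substrate
FOLD = 6.0           # approximate kcat/Km advantage for the RNA-DNA substrate

km_ratio = implied_km_ratio(KCAT_RNA_DNA, KCAT_DNA, FOLD)
print(f"implied Km(DNA) / Km(RNA-DNA): {km_ratio:.1f}")
```

Under that assumption the preference would stem mainly from tighter apparent binding (a lower Km) rather than from faster turnover, since the two kcat values are nearly equal.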

Both S. pombe Aptx and S. cerevisiae Hnt3^APTX also harbour 
5’-AMP^RNA-DNA deadenylase activity (Extended Data Fig. 1d, e). To 
determine whether Aptx deadenylates abortive ligation products 
generated at RNA-DNA junctions in vivo, we examined whether the 
phenotypes of budding yeast strains with varying capacity to incorporate 
and repair ribonucleotides were altered by Hnt3^APTX deficiency 
(Fig. 2). An M644G variant of the leading strand replicase, DNA 
polymerase ε (Pol ε, encoded by the POL2 gene, see Extended Data Table 1), 
has increased capacity to incorporate ribonucleotides into DNA in vitro 
and in vivo. We generated heterozygous diploids in which one copy 
of HNT3 was replaced with the NatMX4 marker. Tetrad analysis 
showed that although HNT3 is dispensable for growth in a wild-type 
Pol ε (POL2) strain, growth of pol2-M644G hnt3Δ haploids is severely 
impaired (Fig. 2a), with macroscopic colonies only observed after 
extended incubation (Extended Data Fig. 2a). 

We reasoned that the growth defect of the pol2-M644G hnt3Δ strain 
is linked to accumulation of persistent adenylated DNA strand breaks 
generated by DNA ligase processing of RNase H2-incised RNA-DNA 
junctions (that is, 5’-AMP^RNA-DNA). To test whether RNase H2 activity 
contributes to the impaired growth of the pol2-M644G hnt3Δ mutant, 
we sporulated and dissected a diploid strain homozygous for deletion 
of the gene encoding the catalytic subunit of RNase H2 (RNH201). 
Notably, deleting RNH201 (rnh201Δ) largely mitigated the growth 
defect of the pol2-M644G hnt3Δ mutant (Fig. 2b, c). This observation 
indicates that incision of ribonucleotides in DNA by RNase H2 
generates an RER intermediate leading to production of 5’-AMP^RNA-DNA 
that requires deadenylation by Hnt3^APTX (see model, Fig. 2e). 
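
Growth defects such as the one in Fig. 2c are reported as doubling times from log-phase cultures. A minimal sketch of one common way to estimate a doubling time follows: a least-squares fit of ln(density) against time, assuming exponential growth. The readings are synthetic and the function name is hypothetical; this is not the authors' analysis script.

```python
import math


def doubling_time(times, densities):
    """Estimate doubling time from log-phase readings via a least-squares
    fit of ln(density) against time; result is in the units of `times`."""
    if len(times) != len(densities) or len(times) < 2:
        raise ValueError("need at least two paired readings")
    logs = [math.log(d) for d in densities]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    rate = sxy / sxx  # specific growth rate (per unit time)
    if rate <= 0:
        raise ValueError("culture is not growing")
    return math.log(2) / rate


# Synthetic optical-density readings constructed to double every 90 min.
times_min = [0, 30, 60, 90, 120]
od600 = [0.10 * 2 ** (t / 90) for t in times_min]
print(f"estimated doubling time: {doubling_time(times_min, od600):.1f} min")
```

Fitting all log-phase points, rather than using only the first and last reading, makes the estimate less sensitive to noise in any single measurement.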

Increased Rnr3 protein level is a sensitive indicator of S-phase 
checkpoint activation. An increased level of the Rnr3 subunit of 
ribonucleotide reductase was detected in pol2-M644G hnt3Δ cells (Fig. 2d, 
lane 6), but was reduced in the triple mutant pol2-M644G hnt3Δ 
rnh201Δ strain (lane 8) to a level equivalent to that of a pol2-M644G 
rnh201Δ mutant (lane 7). This suggests that failure of Hnt3^APTX to 
deadenylate 5’-AMP^RNA-DNA lesions activates the S-phase checkpoint. 
We also tested hnt3Δ mutant strains for sensitivity to genotoxic stress 
caused by hydroxyurea (HU). HU treatment increases rNMP incorporation 
and induces replication fork stalling. Growth of the pol2-M644G 
hnt3Δ mutant on rich medium was slowed, and survival in 
the presence of HU was reduced (Extended Data Fig. 2b, c). Notably, 
deleting RNH201 reduced HU sensitivity to a level comparable to that of 
pol2-M644G rnh201Δ cells (Extended Data Fig. 2c). 

Next we examined the consequences of loss of Hnt3 function in 
yeast strains containing a Pol ε variant with reduced capacity to 
incorporate ribonucleotides, pol2-M644L (ref. 13). With fewer 
ribonucleotides in the genome, the pol2-M644L hnt3Δ mutant displayed normal 
growth (Fig. 2c) and was unaffected by deleting RNH201. The stark 
contrast between the consequences of loss of Hnt3 function in the 
pol2-M644G variant (high genomic ribonucleotides) versus the pol2-M644L 
mutant (reduced genomic ribonucleotides) is consistent with the model 
wherein Hnt3^APTX deadenylates genotoxic abortive ligation intermediates 
arising during RER of ribonucleotides incorporated by Pol ε during 
DNA replication (Fig. 2e). A genetic interaction between HNT3 
and RNH201 is not apparent in a POL2 strain, possibly because 
adenylated RNA-DNA junctions may be removed by alternative nucleolytic 
processing, for example, mediated by Rad27^FEN1 and 
Mre11/Rad50/Xrs2^NBS1 nucleases. 

Having implicated aprataxins in processing 5’-AMP in vitro 
and in vivo, we aimed to define the molecular basis for 5’-AMP^RNA-DNA 
processing by human APTX. Structural analysis of the S. pombe 
Aptx-DNA complex revealed the architecture of the yeast Aptx HIT-Znf 
domain, and a basis for engagement of DNA ends. However, the molecular 
basis for the APTX RNA-DNA interactions, and the mechanism 
of the APTX DNA damage direct reversal catalytic reaction, remain 
unclear (see Supplementary Discussion). The minimal catalytic domain 
of human APTX was mapped to residues 165-342 using deletion 
mutagenesis, limited proteolysis and deadenylation assays (Extended Data 
Fig. 3a-c). We then determined four X-ray crystal structures: (1) an 
RNA-DNA-bound human APTX-5’-AMP-RNA-DNA-Zn quaternary 
product complex; (2) a mimic of an adenylated RNA-DNA processing 
enzymatic transition state; (3) a DNA-only bound human 
APTX-5’-AMP-DNA-Zn quaternary complex structure; and (4) an AOA1 mutant 
human APTX(K197Q) RNA-DNA bound quaternary product complex 
(see Supplementary Discussion, Extended Data Table 2 and Extended 
Data Fig. 3). 

Figure 2 | Yeast Hnt3^APTX is critical for resolving abortive ligation 
intermediates that arise after incision at genomic ribonucleotides by RNase 
H2. a, Tetrad analysis of HNT3/hnt3::natMX diploids. 1-8 are tetrad 
dissections and A-D are haploid spore colonies. Right: day 3 microscopic spore 
colonies in the pol2-M644G hnt3Δ strains. b, Tetrad analysis of HNT3/ 
hnt3::natMX diploids in the pol2-M644G rnh201Δ background. Plates imaged 
at 3 days. c, Deletion of HNT3 in the pol2-M644G mutator confers a slow 
growth phenotype that is eliminated by deleting RNH201. Doubling times (Dt) 
were calculated from cultures in the logarithmic phase of growth in rich 
medium at 30 °C. Average doubling times ± s.d. were calculated from four 
biological replicates (eight for the pol2-M644G hnt3Δ genotype, *P < 0.0007; 
**P < 0.0011 (two-tailed t-test)). d, Immunoblotting of whole-cell extracts was 
performed using an antibody to Rnr3. e, RNase H2 cleavage at ribonucleotides 
incorporated during Pol ε leading-strand DNA synthesis leads to abortive 
ligation intermediates requiring APTX processing. Deletion of HNT3 (hnt3Δ) 
or APTX deficiency in ataxia oculomotor apraxia 1 (AOA1) creates persistent 
adenylated strand breaks. 

Figure 3 | Recognition of adenylated RNA-DNA junctions by human 
APTX. a, Domain architecture of human APTX. The RNA deadenylase 
core used for structural studies maps to residues 165-342. b, X-ray structure 
of the human APTX-RNA-DNA-AMP-Zn reaction product complex. The 
APTX HIT domain (tan) and Znf domain (blue) are displayed as 
cartoon-representation helices (cylinders) and β-strands. DNA is displayed as a 
magenta duplex, with a green 5’-ribonucleotide and yellow AMP lesions. c, Four 
conserved elements dictate interactions with the 5’-ribonucleotide (green) and 
5’-AMP (yellow with orange/red phosphate group). The β2-β3 loop (orange), 
HIT α1 (gold), Znf α3 (blue) and HΦHΦH loop (dark green) completely 
envelop and orient the 5’-adenylated ribonucleotide lesion for catalytic 
processing. 

The APTX α-β histidine triad (HIT) fold domain assembles with a 
DNA-binding Znf domain in human APTX RNA-DNA deadenylase 
(Fig. 3b and Extended Data Figs 4 and 5). Close interactions between 
the HIT and Znf subdomains mould both the active site and the extended 
RNA-DNA damage-binding surface (Fig. 3b and Extended Data Fig. 5). 
The 5’-adenylate binding pocket and 5’-ribonucleotide interaction 
surfaces localize to the intersection of the HIT and Znf domains (Fig. 3b, c). 
The APTX-bound RNA-DNA junction is significantly distorted from 
B-form DNA (Extended Data Fig. 5a). A two-point nucleic acid-protein 
interaction induces a ~15° bend in the RNA-DNA by anchoring the 
exposed 5’-terminal RNA base stack and the 5’-AMP lesion on one 
side, while engaging the opposite undamaged strand with an array of 
contacts from the Znf domain (Extended Data Fig. 5d-g). Biochemical 
studies revealed that APTX disrupts Watson-Crick base pairing of the 
adenylated base pair. In our structures, DNA distortions and capping 
of the RNA-DNA base-stack by the HIT domain amino-terminal helix 
(α1) provide a possible mechanism for un-pairing of the terminal rG•C 
base pair to gain access to the lesion (Extended Data Fig. 5b-g). In the 
DNA-only bound human APTX structure, similar DNA distortions 
are observed, revealing that APTX processes adenylated RNA-DNA 
and DNA with an analogous mode of substrate engagement (Extended 
Data Fig. 6 and Supplementary Discussion). APTX sequesters the 
5’-AMP lesion into a hydrophobic active site recess in an extra-helical 
conformation that is rotated ~180° relative to the RNA-DNA helical 
axis (Fig. 3c and Extended Data Fig. 5c). 

RNA-DNA damage detection and reaction chemistry are mediated 
by four stringently conserved APTX elements that converge on the 
5’-ribonucleotide and 5’-AMP lesion: HIT helix α1, the ‘histidine triad’ 
HΦHΦH loop (where H is histidine and Φ denotes a hydrophobic 
amino acid), the ‘β2-β3’ loop, and Znf helix α3 (Fig. 3c and Extended 
Data Fig. 4). The two 5’-terminal nucleotides of the damaged strand 
are bound in an A-form conformation, consistent with an RNA-DNA 
processing role for the aprataxins. Multiple contacts bind a C3’-endo 
sugar-puckered 5’-rG, including ribose sugar-phosphate interactions 
from Tyr 195 and Lys 197 of the β2-β3 loop, and aromatic base stacking 
from Trp 167 of HIT α1 (Fig. 3c and Extended Data Figs 3f and 5c). 
Cradling of the 2’-hydroxyl of the ribonucleotide with van der Waals 
interactions from Tyr 195 (β2-β3 loop) and Met 256 of the HΦHΦH 
loop further anchors the 5’-rG and aids in aligning the 5’-adenylated 
RNA terminus for catalysis (Fig. 3c and Extended Data Fig. 5c). Mutational 
studies underscore the importance of the β2-β3 loop in substrate binding 
and catalytic activity (Supplementary Discussion and Extended 
Data Fig. 6). 

The first step of the APTX reaction is proposed to generate a covalent 
enzyme-AMP intermediate, via an enzyme-nucleic acid transition 
state that poses a significant challenge to protein structural 
interrogation. To trap this transition state, we developed reaction conditions 
under which APTX activity is inhibited when co-incubated with 
adenosine, orthovanadate and a 5’-phosphorylated RNA-DNA junction 
duplex (Extended Data Fig. 3g, h). Reaction of human APTX with these 
reagents in crystallo produced a mimic of the enzyme-RNA-DNA-AMP 
transition state intermediate for step 1 of a two-step deadenylation 
reaction (Fig. 4a and Extended Data Fig. 3i, j). 

The HΦHΦH loop completely encircles the adenylated 5’-ribonucleotide 
lesion (Fig. 4), with His 260 covalently bonded to a pentavalent 
coordinate vanadium atom in the transition state mimic complex (Fig. 4a, d 
and Extended Data Fig. 3i). The transition state and product-bound 
structures support a two-step deadenylation reaction initiated by 
nucleophilic attack on the scissile pyrophosphate by His 260 (Fig. 4d). Protein 
main-chain amides of Ser 255-Met 256 and salt bridging from His 201- 
His 262 stabilize this transition state (Fig. 4a). His 251 is ideally 
positioned to protonate a 5’-P leaving group, and binds 5’-P together with 
Ser 255 and Lys 277. In the proposed reaction scheme (Fig. 4d), step 1 
generates an enzyme-AMP intermediate, which is then resolved via 
hydrolysis in step 2. Notably, in the product complex, a Na+ ion (assigned 
by oxygen-ion bond lengths) with octahedral coordination binds between 
Ser 255 and 5’-P (Fig. 4b), indicating that although APTX activity is 
metal-independent, transient solvent cation binding stabilizes the 
product state, similar to third metal binding in DNA polymerase η. 


6 FEBRUARY 2014 | VOL 506 | NATURE | 113 


©2014 Macmillan Publishers Limited. All rights reserved 


LETTER 


[Figure 4 image. Recoverable panel titles: a, Transition state complex; b, Product complex A (assembled active site); c, Product complex B (disassembled active site).]

Figure 4 | RNA-DNA deadenylation reaction mechanism and APTX 
inactivation in AOA1. a, Human APTX-RNA-DNA-adenosine-vanadate 
transition state mimic complex active site. b, Product complex (assembled 
active site) with 5’-ribonucleotide (green) and 5’-AMP (yellow) bound in the 
substrate interaction cleft. c, Product complex (disassembled active site) 
[...] substrate interaction cleft. d, Proposed human APTX reaction mechanism. 
e, Structural overlays of human APTX states illustrate the coupled movement 
of the N-terminal α1 helix and the HΦHΦH active-site loop. f, Structural 
repercussions of the AOA1 APTX(K197Q) variant. A structural overlay of wild 
type (grey) and mutant K197Q (pink). 

Human APTX is found in two markedly different conformations in 
the product-bound structure. The first conformation (the ‘assembled 
active site’; Fig. 4b and Extended Data Fig. 7a, b) has an intact active site 
characterized by close interactions between HIT α1 (Leu 171 and Trp 167) 
and the HΦHΦH loop, and correct positioning of His 260 for catalysis. 
This state has the His 260 imidazole ring hydrogen bonded to the His 268 
main-chain carbonyl oxygen. In the second state (the ‘disassembled 
active site’), α1 is displaced by ~4 Å relative to a rearranged HΦHΦH 
loop, and His 260 is flipped out of alignment for nucleophilic attack 
(compare Fig. 4b and c). Structural overlays (Fig. 4e) and interpolations 
between these two states (Supplementary Videos 1 and 2) indicate that 
concerted conformational rearrangements sculpt the HΦHΦH loop, 
and may be linked to RNA-DNA substrate binding by α1 and HΦHΦH 
(Extended Data Fig. 7b-e). We propose that interactions between 
RNA-DNA and protein proximal to the active site regulate active-site 
conformations involving HIT α1. RNA/DNA-regulated assembly of the 
APTX active site may ‘license’ catalytic activity and also prevent 
inappropriate, nonspecific hydrolysis of nucleotides (for example, ATP or 
ADP hydrolysis). Discrimination against ATP cleavage may be critical 
for mitochondrial APTX isoforms that have previously been implicated 
in DNA damage repair in mitochondria, because off-target catalysis 
could imbalance nucleotide pool levels. 

Both missense and truncating APTX substitutions are linked to 
neurodegenerative disease. On the basis of the human APTX structures 
determined here, we predict that most AOA1 mutations (D185E, 
A198V, P206L, G231E, R247X, V263G, D267G, W279X, W279R and 
R306X) will decrease protein stability by truncating the polypeptide or by 
altering the protein-folding core (Extended Data Fig. 8a). Conformational 
differences between our RNA-DNA bound structures extend into the 
protein core (Extended Data Fig. 7a). APTX conformational changes 
may thus be subject to mutagenic modulation in disease. We posit that 
differential impacts on protein folding, active-site chemistry and 
substrate induced-fit active site assembly may all contribute to the variable 
clinical outcomes observed in patients with APTX defects. 
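
The AOA1 substitutions listed above split into truncating changes (written with X for a stop codon, by the usual convention) and missense changes. A minimal sketch that parses the listed names and sorts them accordingly is given below; the helper name is hypothetical, and the 165-342 range is the catalytic core stated in the text, used here only to check where each position falls.

```python
import re

# Mutation names as listed in the text.
AOA1_MUTATIONS = [
    "D185E", "A198V", "P206L", "G231E", "R247X",
    "V263G", "D267G", "W279X", "W279R", "R306X",
]

# Catalytic core of human APTX used for the structures (residues 165-342).
CORE_START, CORE_END = 165, 342

_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def classify(mutation: str):
    """Return (position, kind) where kind is 'truncating' or 'missense'."""
    match = _PATTERN.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation name: {mutation!r}")
    position = int(match.group(2))
    kind = "truncating" if match.group(3) == "X" else "missense"
    return position, kind


for name in AOA1_MUTATIONS:
    position, kind = classify(name)
    in_core = CORE_START <= position <= CORE_END
    print(f"{name}: {kind}, within catalytic core: {in_core}")
```

Sorting by kind separates the substitutions predicted to act by truncation from those predicted to perturb the folding core.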

One AOA1 mutation is found in the RNA-DNA substrate interaction 
cleft (K197Q) and two participate directly in active-site chemistry (H201R 
and H201Q) (Fig. 4a-c and Extended Data Fig. 8a). The late-onset 
AOA1 variant APTX(K197Q) displays significantly impaired 
deadenylation activity on both the 5’-AMP^DNA and 5’-AMP^RNA-DNA substrates 
(Extended Data Fig. 6b). To understand the molecular basis for the 
K197Q defect, we determined a 1.90 Å X-ray structure of APTX(K197Q) 
bound to RNA-DNA and AMP that reveals the mutant protein 
harbours a distorted active-site pocket (Fig. 4f and Extended Data Fig. 8b, c). 
In the wild-type protein, Lys 197 participates in salt-bridging 
interactions with the 5’-terminal sugar-phosphate backbone and the AMP 
lesion 2’-hydroxyl. In the mutant, Gln 197 is rotated away from the 
substrate-binding pocket and replaces the direct protein-substrate 
interaction with a protein-water-substrate nucleic acid binding interaction, 
thus revealing that distortions in the APTX(K197Q) substrate-binding 
pocket underlie AOA1. 

Our data indicate that during repair of non-canonical ribonucleotides 
introduced into DNA during replication of the nuclear genome, 
DNA ligases generate 5’-adenylated RNA-DNA junctions that can 
elicit a DNA damage checkpoint response unless this is prevented 
by APTX deadenylase. In addition to frequent ribonucleotide 
incorporation by DNA replicases, rNTPs are used by RNA primase to initially 
synthesize ~5% of the nascent lagging strand, and rNTPs are also 
incorporated during mitochondrial DNA replication, during 
translesion synthesis, and during DNA repair. Ribonucleotide incorporation 
during DNA repair may be more prevalent in non-proliferating 
cells because dNTP concentrations are lower, thereby increasing 
rNTP:dNTP ratios. Thus, the late onset of AOA1 might partly reflect 
failure to deadenylate RNA-DNA junctions resulting from 
ribonucleotides incorporated in DNA transactions occurring over many years 
in quiescent neurons. It will be important in future work to establish 
quantitative measures of RNA-DNA adenylation to explore this hypothesis. 

In this context, APTX acts in a nucleic acid transaction that is not 
exclusively DNA or RNA. Instead, using a reaction mechanism that is 
finely tuned to operate on RNA-DNA junctions, APTX acts in an 
RNA-DNA damage response (RDDR) to protect the genome from a 
compound insult, a ribosylated, adenylated 5’ terminus. In a broader 
sense, it seems probable that other enzymes may also modulate the 
RDDR via the detection, processing and signalling of RNA-DNA- 
derived structures posing threats to genomic integrity. 


METHODS SUMMARY 


Proteins were expressed in Escherichia coli and purified with standard procedures. 
All crystals were grown using sitting-drop vapour diffusion. X-ray diffraction data 
were all collected at 100 K at the Advanced Photon Source, beamlines 22-ID and 
22-BM. Initial DNA-bound human APTX structures were solved by molecular 
replacement with the S. pombe Aptx-DNA complex (RCSB code 3SZQ). RNA- 
DNA-bound wild-type and mutant human APTX structures were solved by molecu- 
lar replacement using the refined human APTX-DNA-bound model. S. cerevisiae 
strain construction and growth assays were performed as described. 


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
Crystal structures of the Lsm complex bound to the 
3’ end sequence of U6 small nuclear RNA 


Lijun Zhou, Jing Hang, Yulin Zhou, Ruixue Wan, Guifeng Lu, Ping Yin, Chuangye Yan & Yigong Shi 


Splicing of precursor messenger RNA (pre-mRNA) in eukaryotic 
cells is carried out by the spliceosome, which consists of five small 
nuclear ribonucleoproteins (snRNPs) and a number of accessory 
factors and enzymes. Each snRNP contains a ring-shaped subcomplex 
of seven proteins and a specific RNA molecule. The U6 snRNP 
contains a unique heptameric Lsm protein complex, which specifically 
recognizes the U6 small nuclear RNA at its 3’ end. Here we 
report the crystal structures of the heptameric Lsm complex, both 
by itself and in complex with a 3’ fragment of U6 snRNA, at 2.8 Å 
resolution. Each of the seven Lsm proteins interacts with two 
neighbouring Lsm components to form a doughnut-shaped assembly, 
with the order Lsm3-2-8-4-7-5-6. The four uridine nucleotides at 
the 3’ end of U6 snRNA are modularly recognized by Lsm3, Lsm2, 
Lsm8 and Lsm4, with the uracil base specificity conferred by a highly 
conserved asparagine residue. The uracil base at the extreme 3’ end 
is sandwiched by His 36 and Arg 69 from Lsm3, through π-π and 
cation-π interactions, respectively. The distinctive end-recognition 
of U6 snRNA by the Lsm complex contrasts with RNA binding by 
the Sm complex in the other snRNPs. The structural features and 
associated biochemical analyses deepen mechanistic understanding 
of the U6 snRNP function in pre-mRNA splicing. 

Four of the five snRNPs (U1, U2, U4 and U5) share the Sm heptamer 
ring. By contrast, the heptamer ring in the U6 snRNP contains seven 
Sm-like (Lsm) proteins: Lsm2, Lsm3, Lsm4, Lsm5, Lsm6, Lsm7 and 
Lsm8 (refs 5-7). The U6 snRNP participates in formation of the 
pre-catalytic spliceosome and the two cleavages of pre-mRNA splicing. 
U6 snRNP in yeast contains the Lsm2-8 heptamer, a 112-nucleotide 
RNA, and Prp24 (refs 14-16). The 3.6 Å resolution crystal structure 
of the Sm heptamer ring bound to U4 snRNA uncovered a conserved 
pattern of RNA recognition. Another Lsm heptameric complex, Lsm1-7, 
which shares six components with the Lsm2-8 complex, functions in 
the mRNA degradation pathway in the cytoplasm. The recombinant Lsm 
heptameric complexes have been successfully reconstituted in vitro 
through denaturation and refolding. 

We co-expressed all seven Lsm proteins from Saccharomyces cerevisiae 
and purified the heptameric Lsm2-8 complex to homogeneity 
(Extended Data Fig. 1a). Consistent with the finding that the Lsm2-8 
complex specifically recognized a uridine-rich sequence derived from 
the 3’ end of U6 snRNA, the U6 snRNA sequences from multiple species 
share four consecutive uridine nucleotides at their 3’ ends (Extended 
Data Fig. 1b). We synthesized seven RNA oligonucleotides, each derived 
from the 3’ end of S. cerevisiae U6 snRNA, and examined their binding 
to the Lsm2-8 complex using isothermal titration calorimetry (ITC) 
(Fig. 1a). The tetra-uridine oligonucleotide 5’-UUUU112-3’ bound to 
the Lsm2-8 complex with a dissociation constant of 694 ± 106 nM (Fig. 1a 
and Extended Data Fig. 2a). The pentanucleotide 5’-GUUUU112-3’ 
had a dissociation constant of 109 ± 6 nM, a 6.4-fold increase in affinity 
over 5’-UUUU112-3’. Further extension of the 5’-sequence only led to slightly 
enhanced binding (Fig. 1a and Extended Data Fig. 2b-g). This result 
demonstrates that a short RNA oligonucleotide derived from the 3’ end 
of U6 snRNA constitutes a minimal RNA element for specific 
recognition by the Lsm2-8 heptameric complex. 

Figure 1 | Structure of the Lsm2-8 heptameric complex. a, Five nucleotides 
at the 3’ end of U6 snRNA are responsible for the bulk of binding energy between 
U6 snRNA and the Lsm2-8 complex. The data summarized here represent 
the median of three independently performed isothermal titration calorimetry 
(ITC) experiments. Error bars represent s.d. The same applies to Figs 1b and 4a. 
b, Mutation of any of the five nucleotides at the 3’ end of U6 snRNA results 
in decreased binding affinity for the Lsm2-8 complex. c, Overall structure of the 
Lsm2-8 complex in two perpendicular views. 

We investigated the sequence requirement. For the octanucleotide 
5’-UUCGUUUU112-3’, each of the six bases starting from the 3’ end 
was replaced by a similar base: uracil by cytosine and guanine by 
adenine. The mutated oligonucleotides were examined for binding to the 
Lsm2-8 complex. Whereas the Lsm2-8 complex has a binding affinity 
of 52 ± 7 nM towards the wild-type (WT) octanucleotide, individual 
replacement of the four uracil bases at the 3’ end results in reduction of 
binding affinity by at least 4.5-fold (Fig. 1b and Extended Data Fig. 3a-d). 
Replacement of the fifth nucleotide from the 3’ end only reduced 
binding affinity by twofold (Fig. 1b and Extended Data Fig. 3e). 
Replacement of the sixth nucleotide had little effect on binding (Fig. 1b and 
Extended Data Fig. 3f). These results indicate that the five nucleotides 
from the 3’ end of U6 snRNA may represent the optimal sequence for 
binding by the Lsm2-8 complex. 
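
Fold changes between dissociation constants translate into binding free-energy differences through ΔΔG = RT ln(Kd,weak / Kd,tight). A minimal sketch using the three Kd values quoted in the text follows. The temperature of 298.15 K is an assumption (the ITC temperature is not given here), and the function names are illustrative.

```python
import math

GAS_CONSTANT = 8.314   # J mol^-1 K^-1
TEMPERATURE = 298.15   # K; assumed, the ITC temperature is not stated in the text

# Dissociation constants (nM) quoted in the text for each oligonucleotide.
KD_NM = {"UUUU": 694.0, "GUUUU": 109.0, "UUCGUUUU": 52.0}


def affinity_fold(kd_weak_nm: float, kd_tight_nm: float) -> float:
    """Fold increase in affinity when Kd drops from kd_weak_nm to kd_tight_nm."""
    if kd_weak_nm <= 0 or kd_tight_nm <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_weak_nm / kd_tight_nm


def ddg_kj_per_mol(kd_weak_nm: float, kd_tight_nm: float,
                   temperature: float = TEMPERATURE) -> float:
    """Binding free energy gained (kJ/mol) by the tighter binder: RT ln(fold)."""
    fold = affinity_fold(kd_weak_nm, kd_tight_nm)
    return GAS_CONSTANT * temperature * math.log(fold) / 1000.0


fold_5th = affinity_fold(KD_NM["UUUU"], KD_NM["GUUUU"])
print(f"UUUU -> GUUUU: {fold_5th:.1f}-fold, "
      f"{ddg_kj_per_mol(KD_NM['UUUU'], KD_NM['GUUUU']):.1f} kJ/mol")
fold_ext = affinity_fold(KD_NM["GUUUU"], KD_NM["UUCGUUUU"])
print(f"GUUUU -> UUCGUUUU: {fold_ext:.1f}-fold")
```

On this scale the fifth nucleotide contributes far more than the three further 5’ nucleotides combined, which matches the text's conclusion that five nucleotides account for most of the binding energy.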

Reasoning that extended hydrophilic sequences at the carboxy 
termini of Lsm4/Lsm8 (Extended Data Fig. 4a) may hinder crystal 
packing, we generated three Lsm complexes in which these sequences were 
deleted. Truncation of both C termini had no significant effect on RNA 
binding (Extended Data Fig. 4b). We crystallized the truncated Lsm2-8 
complex in two different space groups, among which P21 yielded better 
crystals. The structure was determined at 2.8 Å resolution by molecular 
replacement using the atomic coordinates of the Lsm1-7 complex (PDB 
code 4M75) (Extended Data Table 1 and Extended Data Fig. 5a-c). 

The Lsm2-8 complex has a doughnut-shaped structure, approximately 
70 Å in outer diameter, 15 Å in inner diameter, and 45 Å in 
thickness (Fig. 1c). Lsm3, Lsm2, Lsm8, Lsm4, Lsm7, Lsm5 and Lsm6 interact 
with each other to form a closed ring, with each component only 
contacting two neighbouring proteins. Each Lsm protein adopts a highly 
conserved Sm fold, with a tilted β-sandwich stabilized by an 
amino-terminal α-helix. The β-sandwich contains five anti-parallel β-strands 
in one sheet and three β-strands in the other; the three-stranded β-sheet 
is capped on one side by the N-terminal α-helix. The other side of the 
three-stranded β-sheet interacts with the five-stranded β-sheet of a 
neighbouring Lsm protein to form a contiguous eight-stranded, 
anti-parallel β-sheet (Fig. 1c). 

This structural organization allows the main chain carbonyl and amide 
groups of neighbouring Lsm proteins to form multiple intermolecular 
hydrogen bonds (Extended Data Fig. 5d). Despite the preponderance 
of these main chain hydrogen bonds, the side chains from the seven 
Lsm proteins contribute heavily to the specific formation of the Lsm2-8 
heptameric complex. For example, Tyr 8 from Lsm8 donates a hydrogen 
bond to Asn 59 from Lsm2, whereas Ile 44 from Lsm5 and Val 11 from 
Lsm6 interact with each other through van der Waals contact (Extended 
Data Fig. 5e). 
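
The closed ring order Lsm3-2-8-4-7-5-6 fixes which two partners each subunit touches. A minimal sketch that treats the order as a circular list and checks that the side-chain contacts named above (Lsm8 with Lsm2, Lsm5 with Lsm6) fall between adjacent subunits is given below; the function names are hypothetical.

```python
# Ring order as stated in the text; the ring is closed, so Lsm6 meets Lsm3.
RING_ORDER = ["Lsm3", "Lsm2", "Lsm8", "Lsm4", "Lsm7", "Lsm5", "Lsm6"]


def neighbours(protein: str):
    """Return the (previous, next) subunits around the closed ring."""
    index = RING_ORDER.index(protein)
    size = len(RING_ORDER)
    return RING_ORDER[(index - 1) % size], RING_ORDER[(index + 1) % size]


def are_neighbours(first: str, second: str) -> bool:
    """True when the two subunits sit next to each other in the ring."""
    return second in neighbours(first)


# Side-chain contacts described in the text, as (subunit, subunit) pairs.
for pair in [("Lsm8", "Lsm2"), ("Lsm5", "Lsm6")]:
    print(pair, "adjacent in ring:", are_neighbours(*pair))
```

The modulo arithmetic is what closes the ring, making Lsm6 and Lsm3 neighbours even though they sit at opposite ends of the written order.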

Next, we crystallized the Lsm2-8 complex in the presence of an 
octanucleotide derived from the 3’ end of U6 snRNA in the space group C2. 
The structure was determined by molecular replacement using atomic 
coordinates of the free Lsm2-8 complex (Extended Data Table 1). The 
RNA bases show excellent electron density (Extended Data Fig. 6a); 
correct assignment of the RNA sequence is confirmed by anomalous 
bromine (Br) signal from 5-Br-uracil of the nucleotide U110 (Extended 
Data Fig. 6b). 

The RNA oligonucleotide is bound within the central hole of the Lsm2-8 
ring (Fig. 2a). The four consecutive nucleotides 5’-U109U110U111U112-3’ 
are recognized by Lsm4, Lsm8, Lsm2 and Lsm3, respectively. These 
nucleotides follow a positively charged surface groove (Fig. 2b) and 
gradually veer towards one side of the Lsm ring, with the fifth nucleotide 
G108 away from the central hole and bound by Lsm7. Despite the presence 
of the octanucleotide in the crystals (Extended Data Fig. 6c), unambiguous 
electron density was observed only for the five consecutive nucleotides 
at the 3’ end (Extended Data Fig. 6a). RNA binding only induces local 
conformational changes in the Lsm2-8 complex (Extended Data Fig. 
7). The side chains of Phe 35 and Arg 63 in Lsm2 re-orient to sandwich 
uracil from U111, whereas Arg 72 in Lsm4 relocates to form cation-π 
interactions with uracil from U109. The side chain of Gln 57 in Lsm7 
undergoes a rotation to hydrogen bond with guanine from G108. 

Each of the four uracil bases at the 3’ end of the U6 snRNA is 
specifically recognized by a conserved pattern of interactions, involving two 
sequence motifs DxXxN and IRGX (Fig. 2c). The Arg residue of the 
IRGX motif and the middle amino acid in DxXxN sandwich an RNA 
base through cation-π and π-π interactions, respectively; the side chain 
of Asn in DxXxN and the amide nitrogen atoms of Gly-Xaa in the 
IRGX motif make direct hydrogen bonds to the base. Accordingly, the 
3’ end uracil from U112 is sandwiched by the side chains of His 36 and 
Arg 69 from Lsm3, through π-π and cation-π interactions, respectively 
(Fig. 2d). Specific recognition of uracil is conferred by two pairs of 
hydrogen bonds, one pair between the 3-NH/4-O groups of uracil and 
the side chain of Asn 38 and the other between the 2-O atom of uracil 
and the main chain amide groups of Gly 70 and Asp 71 (Fig. 2d). 

[Figure 2c image: sequence alignment of the motif regions of Lsm1 to Lsm8; residue-level content not recoverable from the extraction.]

Figure 2 | Recognition of U6 snRNA by the Lsm2-8 heptameric complex. 
a, Overall structure of the Lsm2-8 complex bound to the 3’ end sequence of U6 
snRNA. Five nucleotides at the 3’ end are recognized in the central hole of the 
Lsm ring. Two perpendicular views are shown. For clarity, Lsm5 and Lsm6 
are removed in the right panel. b, The five nucleotides 5’-G108UUUU112-3’ 
bind to a positively charged surface region in the Lsm2-8 ring. The Lsm 
complex is represented by electrostatic surface potential. c, Schematic 
representation of the modular recognition of RNA bases by the Lsm proteins. 
The amino acids that sandwich the uracil base through π-π and cation-π 
interactions are indicated by red triangles and red asterisks, respectively. Uracil 
specificity is conferred by an invariant Asn (red diamond) and the di-residue 
GX of the IRGX motif (red connected arrows). d, Specific recognition of the 
nucleotide U112. e, Coordination of the nucleotide G108. 

The other three uracil bases follow the same pattern of interactions 
(Extended Data Fig. 8). U111 is sandwiched between Phe 35 and Arg 63
from Lsm2, through π-π and cation-π interactions, respectively (Extended
Data Fig. 8a). Notably, U110 no longer has the π-π interactions but
maintains the conserved cation-π interactions involving Arg 57 from Lsm8
(Extended Data Fig. 8b). In addition, there are three base-specific
hydrogen bonds for U109 or U110, compared to four such hydrogen bonds for
U111 or U112 (Fig. 2d and Extended Data Fig. 8). Thus, the two uracils
at the 3’ end are coordinated by more interactions than the other two
uracils away from the 3’ end. The gradual loosening of the interactions
with U109 and U110 culminates in the departure of the fifth nucleotide
G108 from the central hole of the Lsm ring (Fig. 2a). G108 is no longer
coordinated by the DxXxN or the IRGX motifs, with its guanine base
stacked by Trp 35 from Lsm4 and Leu 58 from Lsm7 (Fig. 2e).
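
The pattern described above, with more base-specific hydrogen bonds at the 3’ end than further upstream, can be tallied in a few lines. The following is only an illustrative sketch: the hydrogen-bond counts are those quoted in the text, the protein assignments follow the text's description of Lsm3, Lsm2, Lsm8 and Lsm4 recognizing U112 to U109, and the names UracilContact, CONTACTS and h_bond_total are hypothetical, introduced here for illustration.

```python
from typing import Iterable, NamedTuple


class UracilContact(NamedTuple):
    """One uracil of the U6 3' end and the Lsm protein that recognizes it."""
    nucleotide: str
    protein: str
    base_h_bonds: int  # base-specific hydrogen bonds, as quoted in the text


# Hypothetical container; values are taken from the description in the text.
CONTACTS = [
    UracilContact("U109", "Lsm4", 3),
    UracilContact("U110", "Lsm8", 3),
    UracilContact("U111", "Lsm2", 4),
    UracilContact("U112", "Lsm3", 4),
]


def h_bond_total(nucleotides: Iterable[str]) -> int:
    """Sum the base-specific hydrogen bonds over the named nucleotides."""
    wanted = set(nucleotides)
    return sum(c.base_h_bonds for c in CONTACTS if c.nucleotide in wanted)


three_prime_pair = h_bond_total(["U111", "U112"])  # the two uracils at the 3' end
upstream_pair = h_bond_total(["U109", "U110"])     # the two uracils further upstream
print(three_prime_pair, upstream_pair)
```

The tally counts hydrogen bonds only; it does not capture the loss of the π-π stacking at U110, which the text describes separately.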

The Lsm proteins share 26-40% sequence identity with the corres- 
ponding Sm proteins. Both heptameric complexes have a similar overall 
structure (Fig. 3a). The Lsm2-8 complex and the Sm complex recognize 
specific, but different, RNA elements (Fig. 3b). Recognition of indivi- 
dual RNA bases is quite conserved (Fig. 3c, d). The Sm proteins also 
contain the DxXxN and IRGX motifs and predominantly recognize 
uridine nucleotides. Each uracil base is sandwiched mainly by Arg and
His/Phe through cation-π and π-π interactions, respectively; the base
specificity is conferred through hydrogen bonds by an invariant Asn 
residue (Fig. 3c, d). 

Despite similarity in recognition of individual bases, the overall mode 
of RNA recognition is quite different for the Lsm and Sm complexes. 
Recognition of U6 snRNA by the Lsm complex primarily involves ‘end 
recognition’, where the uridine nucleotide at the 3’ end of the RNA 
element is anchored by Lsm3 and the preceding three nucleotides are 
recognized by Lsm2/8/4 (Fig. 3b-d). By contrast, both U1 and U4 snRNAs 
are bound by the Sm complex through ‘internal RNA recognition’, where 
seven consecutive nucleotides are bound within the central hole of the Sm 
complex, each recognized by a distinct Sm protein, and the preceding 




Figure 3 | Structural comparison with the Sm complex. a, The overall 
structure of the Lsm2-8 heptameric complex (grey) is similar to that of the Sm 
complex (purple). The comparison was generated by aligning Lsm8 to SmB 
of the Sm complex, which has a root-mean-squared deviation of 2.5 Å over 50
Cα atoms. b, The overall mode of RNA recognition is different between the Lsm
and Sm complexes. The Lsm complex caps the 3’ end of the U6 snRNA. 
and ensuing nucleotides are on two sides of the Sm ring (Fig. 3b-d).
Only four out of seven Lsm proteins in the Lsm2-8 complex specifi- 
cally recognize the uracil bases, explaining why four consecutive uri- 
dines contribute the bulk of binding energy (Fig. 1a). 

To corroborate the structural findings, we generated 21 Lsm hepta- 
meric complexes, each containing a missense mutation targeting a key 
residue for RNA recognition, and examined their binding to the
octanucleotide 5’-UUCGUUUU-3’. Mutations in Lsm3 and Lsm2 are most
deleterious, followed by mutations in Lsm4 (Fig. 4a). For these three
Lsm proteins, any mutation of the conserved Asn or the two residues
that sandwich the RNA base led to a drastic reduction of binding affinity.
By contrast, only one mutation in Lsm8, Arg75Ala, resulted in a similarly
drastic reduction of binding affinity. Eight out of nine mutations in
Lsm6/5/7 had a relatively minor effect on RNA binding (Fig. 4a). The only
mutation that had a pronounced effect, Arg74Ala in Lsm6, is explained
by the structural observation that Arg 74 makes a hydrogen bond to the
ribose of U112 (Fig. 2d). These biochemical observations are in excellent
agreement with our structural analysis, which reveals stronger
interactions for U112 and U111 than for U110 and U109 (Fig. 2d and Extended
Data Fig. 8). The weakened interactions for U110/U109 may be caused by
a small degree of off-registry with respect to the 3’ end nucleotide, which
accumulates over the second, third and fourth bases so that the fifth
nucleotide G108 cannot be accommodated within the Lsm ring (Fig. 4b).

The synthetic RNA octanucleotide in our structural studies contains
a 3’-OH group on the ribose of U112. In cells, however, U112 of mature
U6 snRNA is marked by a ribose 2',3’-cyclophosphate group. Presence
of the cyclophosphate group was shown to slightly enhance the binding
affinity for the Lsm2-8 complex; this result was confirmed by our
biochemical analysis (Extended Data Fig. 9). Thus the 3’ end
cyclophosphate may only marginally strengthen RNA end recognition by the
Lsm2-8 complex. We speculate that the negatively charged phosphate
group is probably recognized by Arg 74 of Lsm6, which mediates a
direct hydrogen bond to the ribose 2'-OH group of U112 in the crystal
structure (Fig. 2d). Notably, this Arg, invariant in six Lsm proteins, is
replaced by Ser in Lsm5, which borders Lsm6. This observation may
explain why the 3’ end nucleotide U112 is coordinated by Lsm3/Lsm6,
but not Lsm6/Lsm5.
By contrast, the Sm complex recognizes seven consecutive nucleotides, with the
preceding and ensuing nucleotides placed on two opposing sides of the Sm ring.
c, Comparison of specific base recognition through hydrogen bonds
between the Lsm complex (left panel) and the Sm complex (right panel).
d, Comparison of base stacking interactions between the Lsm complex
(left panel) and the Sm complex (right panel, PDB code 2Y9A).
Figure 4 | Lsm3 anchors the 3’ end of RNA elements. a, Differential
contribution to U6 snRNA recognition by the Lsm components. 21 Lsm
heptameric complexes, each containing a missense mutation targeting the
base-stacking residues or the conserved Asn, were examined for binding to the
octanucleotide 5’-UUCGUUUU112-3’. WT, wild type. b, A proposed
explanation for why the Lsm complex only accommodates four nucleotides.
c, Structure of the Lsm2-8 heptameric complex bound to the RNA fragment


The 3’ end recognition of U6 snRNP is unique among all RNA-binding
proteins, with Lsm3 playing an essential role in anchoring the
3'-uridine of U112. To disrupt 3’ end recognition by Lsm3, we deleted
U112, crystallized the Lsm2-8 complex with the RNA 5’-UUUCGUUU111-3’,
and solved the structure at 2.6 Å resolution (Fig. 4c and Extended
Data Table 1). We had anticipated that this design might result in the
binding of G108-U109-U110-U111 to Lsm7-Lsm4-Lsm8-Lsm2, with U111
anchored by Lsm2 instead. To our surprise, the 3’ end uridine U111 is
still anchored by Lsm3 in exactly the same manner as for U112 (Fig. 4d).
In the structure, U110 and U109 are recognized by Lsm2 and Lsm8,
respectively. Intriguingly, G108 and C107 are now accommodated in the same
general location as U109 of the previous structure, prying open the two
blades of the sandwich, Trp 35 and Arg 72 from Lsm4 (Fig. 4e, f).
This analysis reveals a striking ability for Lsm3 to anchor the 3’ end of
bound RNA elements.

The different modes of RNA recognition (end recognition versus
internal RNA recognition) seem to match perfectly the different
functions and assembly kinetics of the Lsm and Sm complexes. Unlike
other snRNAs, U6 snRNA remains constitutively in the nucleus; the
Lsm2-8 complex is pre-assembled before recognition of U6 snRNA
in the nucleus. By contrast, assembly of other snRNPs occurs in the
cytoplasm, where the heptameric Sm complex can only be assembled
in the presence of the relevant snRNA. Thus, the various snRNAs may
serve as nucleation centres for assembly of the Sm-snRNA complex in
the cytoplasm. Assembly of the Sm ring in vivo is facilitated by
chaperone proteins such as the SMN complex. The La protein is required for
the maturation of U6 snRNA. Similar to the Sm ring, assembly of the
Lsm ring could also be facilitated by other yet-to-be identified
chaperones. Our structural revelations, together with biochemical analyses,
serve as an important framework for mechanistic understanding of the
U6 snRNP function in pre-mRNA splicing.


METHODS SUMMARY 


The seven proteins Lsm2-8 were individually cloned into the pQLink vector and
co-expressed in Escherichia coli, purified to homogeneity, and crystallized by the 




5’-UUUCGUUU111-3’. d, Structural comparison of the Lsm complexes bound
to 5'-UUUCGUUU111-3’ and 5’-UUCGUUUU112-3’. e, A close-up view of
the accommodation of the dinucleotide C107-G108 in the same general location
as that for U109 in the wild-type complex. f, A cartoon representation of
the recognition of the RNA fragment 5’-UUUCGUUU111-3’ by the
Lsm2-8 complex.


hanging-drop vapour-diffusion method. Much protein engineering effort was
directed at obtaining the heptameric complex and improving the quality of crystals.
Diffraction data were collected at Shanghai Synchrotron Radiation Facility
beamline BL17U and SPring-8 beamline BL41XU and processed with HKL2000. The
crystals of RNA-free Lsm2-8 complex belong to the space groups 12,22, and P2,. 
The structure was determined at 2.8 Å resolution by molecular replacement using
the atomic coordinates of the Lsm1-7 heptameric complex (PDB code 4M75). The
Lsm2-8 heptameric complex bound to an octanucleotide derived from the 3’ end
of U6 snRNA was crystallized in the space group C2. The structure was determined
by molecular replacement at 2.8 Å resolution and refined with PHENIX.
Dissociation constants for interactions between the Lsm2-8 complex and various U6
snRNA fragments were determined by isothermal titration calorimetry (ITC).


Online Content Any additional Methods, Extended Data display items and Source 
Data are available in the online version of the paper; references unique to these 
sections appear only in the online paper. 
TELECOMMUTING


No place like home 


Researchers can avoid stressful commutes and boost efficiency by working from home. 


BY KAREN KAPLAN 


Peter Griffith is not a fan of traffic in the
Washington DC area. On a bad day, his
103-kilometre round-trip commute
means that he can spend more than four hours
on the road. By the time he gets to his desk at
NASA's Goddard Space Flight Center, Griffith,
like millions of metropolitan commuters
worldwide, feels drained.

Griffith, who is chief support scientist for
NASA's carbon cycle and ecosystems office in
Greenbelt, Maryland, coordinates the North
American Carbon Program at Goddard. His
work involves computer analysis of remote-sensing
and geospatial data, for which he needs
long, uninterrupted blocks of time, almost
impossible to find when the phone is ringing, e-mails
are pouring into his in-box and people are
knocking on his door. So to save commuting
time and ensure solitude, he works from home
twice a week. “I don't have a wet lab. I don't
have an engineer,” says Griffith. “I'm not one
of the people at Goddard building a satellite.
It’s just easier for me to do a lot of my work
from home.”

Griffith is one of a growing number of 
scientists around the world who are enjoying 
the benefits of working from home. Accord- 
ing to the US Census Bureau, about 13.4 mil- 
lion people in the United States worked from 
home for at least one day a week in 2010, up 
41% from a decade earlier. 

The practice is not new for researchers, who 
have long worked from home writing grant 
applications and research papers, grading 


exams or preparing lectures. But advances in 
technology are facilitating and accelerating this 
trend, allowing researchers to do more from 
home than ever before, especially if they work 
in bioinformatics or computational science 
(see Nature 504, 319-321; 2013). An Internet 
connection provides access to everything from 
e-mail to remote supercomputers; Skype and 
other computer programs enable inexpen- 
sive long-distance voice and video chats; and 
applications such as Google Drive allow mul- 
tiple users to access documents remotely and 
simultaneously. 

Early-career researchers who want to work 
from home will need to determine how to 
balance their obligations. A researcher may 
be required to develop a written proposal for 
their principal investigator, supervisor or
department head that explains why they
want to work from home, the work they will 
do and their projected home-based schedule. 

At the very least, the researcher should be 
prepared to draw up talking points for a
discussion of the proposal. “You need to have a
clear story,” emphasizes Ferdinand Grozema,
a chemist at the Delft University of Technology 
in the Netherlands. 

Early-career researchers should also clarify 
how they plan to address problems that may 
arise. For example, they may need to explain 
how a glitch with an experiment they are 
managing might be handled in their absence, 
or how they can attend a lab or department 
meeting virtually. And, of course, to work most 
successfully from home generally means that 
one should be conducting the type of research 
— like Griffith's — that does not require eight 
to ten hours in the lab each day (see ‘Comforts
of home’).


BREATHING SPACE 

Researchers who work regularly from home 
cite quiet time and the absence of disruption 
as the primary benefits. 

Paul Bédard, a geochemist at the Univer- 
sity of Quebec in Chicoutimi, Canada, spends 
one day a week at home processing data sets. 
While there, he also prepares course mate- 
rial for classes that he teaches in mineralogy 
and geostatistics, and works on grant
applications and papers. “If I’m at the office, I am
constantly getting a knock on my door from 
students or colleagues,” he says. “You need 
quiet time for more than a few minutes to do 
this work, and at home I have a few hours. You 
need breathing and thinking space — you 
need to let your brain wander around. That's 
where you find the solution, the answer.” 

Alison Diaper, who juggles jobs as a contract 
researcher in mental health and addiction at 
the University of Bristol, UK, and as a clinical- 
trials manager at Frenchay Hospital in Bristol, 
says, “I can escape random questions, other 


colleagues and the telephone ringing.” She 
works from home once a week or so, using the 
time to set up studies, analyse data and write 
up results for both jobs. 

There are also other practical considera- 
tions. Bédard is happy to avoid the commute 
through Quebec’s wintry climes. “I don’t have
to drive to the university during a big
snowstorm,” he says.

Marcel Swart, a theoretical chemist at the 
University of Girona in Spain, likes to avoid 
the tourist traffic in the summer that swarms 
in from the coast, east of his home in La Bisbal
d'Empordà. “They don't know where they’re
going,” he says. “You can’t go more than 
30 kilometres an hour.” 


SETTING UP 

Aspiring scientist telecommuters need to 
notify managers, lab mates, colleagues and stu- 
dents of their home schedule well in advance. 
Researchers who are used to working remotely 
say that their regular notification routine 
includes sending out e-mails and texts, leav- 
ing voicemail messages and posting notices 
on their lab calendars and office doors at the 
beginning of the week — in some cases, up to 
ten days in advance — with their home-based 
schedule and contact information. 

If necessary, an early-career researcher 
should make it clear to colleagues, including 
managers and graduate students, that calls, 
texts and e-mails will receive responses only 
at certain times of the day. But veteran home 
workers say that it is crucial to have consist- 
ent access to colleagues, especially for recently 
appointed faculty members. 

Catherine Cardelús, a biologist at Colgate
University in Hamilton, New York, has 6-12
undergraduate students in her lab all year.
When Cardelús is out of the lab, she ensures
that everyone knows her schedule and that 
she is available by phone or online. “I want 
research constantly done, so I make sure my 
students have what they want and need,” she 


TIPS FOR WORKING REMOTELY 


Comforts of home 


To be effective at working from home, keep 
these guidelines in mind. 

• Determine which tasks are best done
at the workplace. Working remotely using
screen-sharing software often changes
the dynamics of a collaboration, and team
members should clarify who has the final
say on changes, or try to produce final drafts
in person.

• Set up your teleworking schedule to
overlap as much as possible with those
of your lab head, supervisor, colleagues
and anyone else with whom you regularly
interact.

• Determine what portion of your work is
best handled by e-mail or online versus
by phone. For detailed calculations
that require input from colleagues,
for example, e-mail is best. A written
record helps to minimize errors and
misunderstandings.

• Keep on top of tasks that need to be done
in person at the lab or office, such as taking
measurements or signing paperwork.

• Arrange regular coffees and lunches with
colleagues and others while at the lab or
office to catch up on informal workplace
news exchanges. K.K.


says. “If you want an active lab, you have
to be accessible. My students can text
me with questions such as, ‘When are
you going to be back in the lab?’ or
‘How do we order some HCl?’ I make
sure that what they need is always there,
and that’s what has allowed me to work
at home when I do.” If she needs to stay
off e-mail or her mobile for an hour
or two, she does so, but provides ample
warning that she will be unavailable.

Getting used to 
providing an open line of communication 
and a transparent schedule may be an adjust- 
ment for researchers who have been accus- 
tomed to more autonomy, she warns. “The 
biggest shocker for most early-career faculty 
members is how hard it is to be able to stay 
at home because people rely on you to be in 
your lab and your office.” 

Depending on the institution, there may 
be thorny or murky policy issues on telecom- 
muting to contend with. When Grozema’s 
first child was born and he wanted to work 
from home, he elected to take a day’s pater- 
nity leave per week for about one-third less 
pay for that day. 

But when his second child arrived about a 
year ago, and Grozema considered working 
from home again, he discovered that many of 
his colleagues regularly worked from home 
without having to take leave and get paid 
less — the policy was not well defined. He 
approached his department head, and the 
two worked out an agreement under which 
Grozema uses a half-day’s leave per week 
when he works from home. 

Once remote workers have settled on 
a schedule, they need to stick to it, say 
researchers. If time at home provides the 
luxury of several hours without interrup- 
tion, an early-career researcher needs to use 
that time to actually do work — many warn 
that it is all too easy to give in to the siren 
song of smartphones and social media. “You 
have to motivate,” says Diaper. “You have to
be strict and say to yourself that you have to 
get the job done. You can’t be swayed by your 
partner's request or your own temptation.” 


DEALING WITH DOWNSIDES 

There are other pitfalls for those who work 
from home, including the possibility of a 
lower profile because of reduced visibility.
Cardelús says that it is wise to interact


regularly and often in person with colleagues, 
associates and superiors. Working from 
home “can be very isolating”, she says. “You 
need to be networking — you need to be 
seen.” 

Some ways of counteracting the potential 
‘out of sight, out of mind’ problem include 
securing a mentor who is particularly sympa- 
thetic to junior researchers’ telecommuting 
and career-support needs. An understand- 
ing mentor might help to keep a home 
worker’s profile high by routinely talking 
up their work, thus mitigating the impact of 
decreased visibility. 

People who work from home do risk miss- 
ing impromptu chats, which can do more 
than just provide entertainment or build 
rapport — they offer access to unofficial intel- 
ligence that is a key part of understanding 
the changing dynamics of every workplace. 
“When I’m home, I miss out on going to have 
coffee with people, and that’s when all kinds of 
information about employment applications, 
the ministries and the university comes up,” 
says Swart. “If I’m not there, I don’t go out — 
and this kind of information is never shared 
on e-mail.”

It can be a challenge, lament early-career
researchers, to deal with the lack of
connection with lab mates and department
colleagues. “You don’t have people
around you to talk about what you're
doing,” says Richard Lonsdale, a
postdoc in computational chemistry
at the University of Bristol, UK.
Lonsdale, who lives two hours’ drive away
in Birmingham,
has been working from home for four days
per week, and makes sure that he regularly
e-mails colleagues and sets up Skype chats
to confer about ideas when he is at home.
He also arranges in-person discussions and
meetings for days on which he comes in to
the university. “You have to make the most
of the day when you're in the lab,” he says.
Scientists who routinely work from home 
agree that it takes effort to counterbalance the 
downsides. But that is not a deal-breaker, they 
say. “It’s not unpleasant to be at a bit of a
distance,” says Grozema, who adds that a day of
telecommuting per week has helped with his
work-life balance. “You don't have to be less
productive.”


Karen Kaplan is the associate Careers editor 
at Nature. 
METRICS
Blog citations count 


Papers that are formally cited by research- 
oriented blogs receive more journal 
citations, finds a study published on
15 January (H. Shema et al. J. Assoc. Inf.
Sci. Technol. http://doi.org/q88; 2014). For
7 of the 12 scientific journals examined
in 2009, and 13 of 19 journals analysed in
2010, papers cited in blog posts aggregated 
by ResearchBlogging.org received more 
subsequent citations than did papers from 
the same journal in the same year that had 
not been cited by blogs. Hiring and tenure- 
review committees could use blog citations 
to assess the impact of recently published 
papers, suggests co-author Hadas Shema, 
an information scientist at Bar-Ilan 
University in Ramat-Gan, Israel. 


TRAINING 


Doctorates diversify 


Leading European Union (EU) research 
universities are adding career development 
to their doctoral programmes, including 
schemes to help postgraduates into 
non-academic careers, finds a 27 January 
report by the League of European Research 
Universities (LERU) in Leuven, Belgium. 
Institutions are increasingly offering 
options including employer-led career- 
skills workshops, employment forums and 
fairs, student consultancies and internships 
with industry, it found. A LERU report
four years ago called for such expansion
in the face of declining academic research
positions and a tight economic climate.
Doctoral students sometimes do not
appreciate how scarce academic
posts are, and institutions need to offer
guidance for alternatives, says Katrien 
Maes, LERU’s chief policy officer. 


SCHOLARSHIPS 
Trust funds PhDs 


The Leverhulme Trust, a non-profit 
research funder in London, will invest 
£10 million (US$16.6 million) to create 
150 doctoral scholarships across all UK 
science and humanities disciplines. Each 
award will be for £70,000 over 36 months. 
Universities can opt to offer extra funding 
to awardees, says trust spokesman Daniel 
Mapp. The scheme is meant to help those 
with undergraduate debt to pursue PhD 
degrees, but winners do not have to aim 
for any one professional path. “It will be 
for individuals to decide how they take 
their careers forward,” Mapp says. Anyone
at a UK university is eligible, but UK and 
European Union students get priority. 
VESSELS FOR DESTRUCTION


BY A. G. CARPENTER 


Muhmughmuhmuh. The hushed
babble of the spectators crests
as the guards bring the prisoner
into the hall.

Patron Jamis looks up at the gallery, 
stern, and the mutters fade. 

The girl looks like all the rest of her 
kind, dirty and scarred, but she has the 
nine bands of Authority tattooed on 
her forehead and she stands straight, 
even under the weight of her chains. 

Jamis clasps his hands behind his 
back, considering. They've had other 
Destructives before, but never one 
with the full nine. It means she is a 
leader and a prophet. Maybe even the 
leader and prophet. 

He clears his throat. 

“We have ways of making you talk.”

She grins. “We have ways of making 
you talk.” The words are sing-song. 

One of the guards steps forward, 
truncheon raised, but the girl doesn't 
flinch and Jamis shakes his head, 
motioning the guard away. 

“It will be more pleasant for everyone if 
you cooperate.” 

She shrugs, awkward with her arms 
bound to the pole across her shoulders. 
“I am cooperating. I let you take me in.”

“You were gravely outnumbered.” 

“Because I chose to be.” A quick shake of 
her head and the irritated quirk in her mouth 
relaxes. “Tell me what you want to know.”

Jamis frowns. This feels wrong. Easy. Not 
at all like he has anticipated. 

“Well?” she asks. 

“They say you can see the future.” 

“Yes.”

“Yes that’s what they say, or yes you can?” 

She raises an eyebrow. “Yes, I can.”

The spectators take a collective breath; the 
noise breaks on the high curve of the ceiling 
like water and falls back in bits and pieces. 
Ahhh-ahhs-sh-shs. 

“Then you must know why I have brought
you here.”

“I do?” A pause. “Oh. You want me to tell
you?”

“Yes, I want you to tell me.” His voice is 
rough with annoyance. 

“You want me to tell you if you will succeed
in wiping out the Destructives.”

Ahhh-ahhs-sh-shsh.


The price of perseverance. 


Jamis waits for the echoes to fade. “Yes.” 
Another shrug. “Of course.” 

His heart hammers in his chest. 

“Of course?” 

“You don't believe me.”


“How can you be certain of this?” 

She looks at him and there is the hint of 
deep water in her eyes. Fathomless. Calm. 
Dangerous. “I have seen the future of man- 
kind, Patron Jamis. It is death.” 

“We will persevere to the bitter end.” It is 
an automatic response. 

“Of course you will.” She coughs, licks
the spittle from her lips, wincing when her 
tongue finds a bloodied split. 

The spectators rumble, uneasy. 

Something cold and prickly writhes in 
Jamis’s gut. Guilt or fear or anger, he can't 
tell. 

The girl leans closer, something like pity 
making lines around her mouth. “It’s okay. 
You can't help it. None of us can.”

“Everyone wants to survive.” 

She shakes her head. “Everyone wants 
to die.” 

Muhmughmuhmuh. 

He raises a hand and the spectators hush. 
“You speak lies.” 

“It is a part of our DNA. The Romans lined
copper cups with lead because it made the 
wine taste sweeter. The European settlers in 
America farmed tobacco. You created vac- 
cines for every ailment known to man, till 
your bodies could no longer defend against 
a live disease.” 

“We didn’t know any better at the time.”

“No.” She sighs. “We cannot help it.”

His gut twinges, but he is not one of her
disciples. “We will survive.” 

She says nothing. 

“We are not mere animals. We were made 
for more than that.”

“Our souls, you mean.” She smiles, 
sunlight on desert sand. “Maybe. But 
this...” She waggles her fingers. “This 
is just a vessel built for destruction.” 

The room is like a bell, clamouring 
with the uncertainty of the crowd in 
the gallery. 

Jamis silences it with a single gun- 
shot. It is a shame to waste a bullet 
on the girl when a knife would have 
done just as well, but it would not have 
caught their attention in the same way. 

He tucks the pistol back into its 
holster. “Take that out.” 

The guards drag the body away and 
Jamis turns to face the gallery. 

Hundreds of eyes burn at him, 
candle flames flickering against the 
dark nothingness. 

“If it is destruction they want, they
shall have it.”

They sigh in agreement. 

Yesssssssssssss. 

“We will fight against this seed of despair 
until it is wiped from the face of the Earth.”

Yesssssssssssss. 

“Go and burn them out.” 

Thunder shakes the room, feet pounding 
the metal floor as they trample out of the 
doors. Men, women and children all bent to 
one purpose. 

Jamis returns to his room. His face itches 
and he scratches it, absent. His fingers come 
away speckled with blood. Something in his 
gut stirs, uncomfortable. Even in death the 
girl is filthy. 

He washes and waits for the army to return. 


Three days later they find him in his room, 
bright-eyed with fever. His cheeks are spot- 
ted with black. His hands, too. 

The guards draw back, a single word on 
their breath. 

Pox. 

Jamis nods and smiles, mad with certainty. 
“I have seen the future.” He coughs, lips wet
with spittle. “It is death.”